% jfriedt / IFCS2018 article
% merge "maximize rejection for a given area" vs. "minimize area for a given rejection"
% demonstrate how quantization pushes noise towards high frequencies => 6 dB of
% rejection per bit, and loss if fewer bits than rejection/6
% develop the linear program including the bit shift
% stress that previously the design was synthesizable but not implementable, whereas now we
% implement it and demonstrate that it runs
% gwen: why is the FIR now implementable when it was not, even on the zedboard -> new FIR?
% Gwen: can we build a real phase noise test bench with this FIR, i.e. add ADC, NCO and mixer
% (zedboard or redpit)

% add the "correct" pyramid
% label schema: check that "argue for the FIR cascade" is done

\documentclass[a4paper,conference]{IEEEtran/IEEEtran}
\usepackage{graphicx,color,hyperref}
\usepackage{amsfonts}
\usepackage{amsthm}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage{algorithm2e}
\usepackage{url,balance}
\usepackage[normalem]{ulem}
\usepackage{tikz}
\usetikzlibrary{positioning,fit}
\usepackage{multirow}
\usepackage{scalefnt}

% correct bad hyphenation here
\hyphenation{op-tical net-works semi-conduc-tor}
\textheight=26cm
\setlength{\footskip}{30pt}
\pagenumbering{gobble}
\begin{document}

\title{Filter optimization for real time digital processing of radiofrequency signals: application
to oscillator metrology}

\author{\IEEEauthorblockN{A. Hugeat\IEEEauthorrefmark{1}\IEEEauthorrefmark{2}, J. Bernard\IEEEauthorrefmark{2},
G. Goavec-M\'erou\IEEEauthorrefmark{1},
P.-Y. Bourgeois\IEEEauthorrefmark{1}, J.-M. Friedt\IEEEauthorrefmark{1}}
\IEEEauthorblockA{\IEEEauthorrefmark{1}FEMTO-ST, Time \& Frequency department, Besan\c con, France }
\IEEEauthorblockA{\IEEEauthorrefmark{2}FEMTO-ST, Computer Science department DISC, Besan\c con, France \\
Email: \{pyb2,jmfriedt\}@femto-st.fr}
}
\maketitle
\thispagestyle{plain}
\pagestyle{plain}
\newtheorem{definition}{Definition}

\begin{abstract}
Software Defined Radio (SDR) provides stability, flexibility and reconfigurability to
radiofrequency signal processing. Applied to oscillator characterization in the context
of ultrastable clocks, stringent filtering requirements are defined by spurious signal or
noise rejection needs. Since real time radiofrequency processing must be performed in a
Field Programmable Gate Array to meet timing constraints, we investigate optimization strategies
to design filters meeting rejection characteristics while limiting the hardware resources
required and keeping timing constraints within the targeted measurement bandwidths.
\end{abstract}

\begin{IEEEkeywords}
Software Defined Radio, Mixed-Integer Linear Programming, Finite Impulse Response filter
\end{IEEEkeywords}

\section{Digital signal processing of ultrastable clock signals}

Analog oscillator phase noise characterization is classically performed by downconverting
the radiofrequency signal to baseband using a saturated mixer,
followed by a Fourier analysis of the beat signal to analyze phase fluctuations close to the carrier. In
a fully digital approach, the radiofrequency signal is digitized and numerically downconverted by
multiplying the samples with a local numerically controlled oscillator (Fig. \ref{schema}) \cite{rsi}.

\begin{figure}[h!tb]
\begin{center}
\includegraphics[width=.8\linewidth]{images/schema}
\end{center}
\caption{Fully digital oscillator phase noise characterization: the Device Under Test
(DUT) signal is sampled by the radiofrequency grade Analog to Digital Converter (ADC) and
downconverted by mixing with a Numerically Controlled Oscillator (NCO). Unwanted signals
and noise aliases are rejected by a Low Pass Filter (LPF) implemented as a cascade of Finite
Impulse Response (FIR) filters. The signal is then decimated before a Fourier analysis displays
the spectral characteristics of the phase fluctuations.}
\label{schema}
\end{figure}

As with the analog mixer,
the non-linear behavior of the downconverter introduces noise or spurious signal aliasing as
well as the generation of the frequency sum signal in addition to the frequency difference.
These unwanted spectral components must be rejected before decimating the data stream
for the phase noise spectral characterization \cite{andrich2018high}. The filters introduced between the
downconverter
and the decimation processing blocks are core components of an oscillator characterization
system, and must reject out-of-band signals below the targeted phase noise -- typically in the
sub $-170$~dBc/Hz range for the ultrastable oscillators we aim at characterizing. The filter blocks will
use most resources of the Field Programmable Gate Array (FPGA) used to process the radiofrequency
datastream: optimizing the performance of the filter while reducing the needed resources is
hence tackled in a systematic approach using optimization techniques. Most significantly, we
tackle the issue by attempting to cascade multiple Finite Impulse Response (FIR) filters with
a tunable number of coefficients and a tunable number of bits representing the coefficients and the
data being processed.
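
As a reminder of why the frequency sum appears, consider the idealized case of a DUT signal $\cos(2\pi f_{\mathrm{DUT}} t)$ multiplied by an NCO output $\cos(2\pi f_{\mathrm{NCO}} t)$. The symbols $f_{\mathrm{DUT}}$ and $f_{\mathrm{NCO}}$ are notations introduced for this sketch only, and amplitudes, phases and noise are omitted. The product-to-sum identity gives
\begin{align*}
\cos(2\pi f_{\mathrm{DUT}} t)\,\cos(2\pi f_{\mathrm{NCO}} t)
 ={}& \tfrac{1}{2}\cos\left(2\pi (f_{\mathrm{DUT}}-f_{\mathrm{NCO}})\, t\right) \\
 &+ \tfrac{1}{2}\cos\left(2\pi (f_{\mathrm{DUT}}+f_{\mathrm{NCO}})\, t\right)
\end{align*}
so that the difference term brings the phase fluctuations to baseband, while the sum term (or its alias after sampling) is one of the components that the low pass filter must remove before decimation.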

\section{Finite impulse response filter}

We select FIR filters for their unconditional stability and ease of design. A FIR filter is defined
by a set of weights $b_k$ applied to the inputs $x_n$ through a convolution to generate the
outputs $y_n$
\begin{align}
y_n=\sum_{k=0}^N b_k x_{n-k}
\label{eq:fir_equation}
\end{align}

As opposed to an implementation on a general purpose processor in which word size is defined by the
processor architecture, implementing such a filter on an FPGA offers more degrees of freedom since
not only the coefficient values and number of taps must be defined, but also the number of bits
defining the coefficients and the sample size. For this reason, and because we consider pipeline
processing (as opposed to First-In, First-Out (FIFO) memory batch processing) of radiofrequency
signals, High Level Synthesis (HLS) languages \cite{kasbah2008multigrid} are not considered but
the problem is tackled at the Very-high-speed-integrated-circuit Hardware Description Language (VHDL) level.
Since latency is not an issue in an open-loop phase noise characterization instrument, the large
number of taps in the FIR, as opposed to the shorter Infinite Impulse Response (IIR) filter,
is not considered an issue, as it would be in a closed-loop system.

The coefficients are classically expressed as floating point values. However, this binary
number representation is not efficient for fast arithmetic computation by an FPGA. Instead,
we choose to quantize these floating point values into integer values. This quantization
will result in some precision loss.

\begin{figure}[h!tb]
\includegraphics[width=\linewidth]{images/zero_values}
\caption{Impact of the quantization resolution of the coefficients: the quantization is
set to 6~bits -- with the horizontal black lines indicating $\pm$1 least significant bit -- setting
the 30~first and 30~last coefficients out of the initial 128~band-pass
filter coefficients to 0 (red dots).}
\label{float_vs_int}
\end{figure}

The tradeoff between quantization resolution and number of coefficients when considering
integer operations is not trivial. As an illustration of the
relation between number of filter taps and quantization, Fig. \ref{float_vs_int} exhibits
a 128-coefficient FIR bandpass filter designed using floating point numbers (blue). Upon
quantization on 6~bit integers, 60 of the 128~coefficients at the beginning and end of the
taps become null, making the large number of coefficients irrelevant and allowing
processing resources to be saved by shrinking the filter length. This tradeoff, aimed at minimizing resources
to reach a given rejection level or maximizing out-of-band rejection for a given computational
resource, will drive the investigation on cascading filters designed with varying tap resolution
and tap length, as will be shown in the next section. Indeed, our development strategy closely
follows the skeleton approach \cite{crookes1998environment, crookes2000design, benkrid2002towards}
in which basic blocks are defined and characterized before being assembled \cite{hide}
in a complete processing chain. In our case, assembling the filter blocks is a simpler block
combination process since we assume a single value to be processed and a single value to be
generated at each clock cycle. The FIR filters are not considered to decimate in the
current implementation: the decimation is assumed to be located after the FIR cascade for the
moment.

\section{Methodology description}
We aim to create a new methodology to develop any Digital Signal Processing (DSP) chain
for any hardware platform (Altera, Xilinx...). To do so, we have defined an
abstract model to represent some basic DSP operations.

For the moment, we focus on only two operations: the filtering and the shifting of data.
We have chosen these basic operations because shifting and filtering have already been studied in
many works \cite{lim_1996, lim_1988, young_1992, smith_1998}, hence it will be easier
to check and validate our results.

Having only two operations is insufficient to work with complex DSP chains, but
in this paper we only want to demonstrate the relevance and the efficiency of our approach.
In future work it will be possible to add more operations, so as to be able to
model any DSP chain.

We apply our methodology to a very simple DSP chain. We generate a digital signal
either with a Pseudo-Random Number (PRN) generator or with an Analog to Digital
Converter (ADC). Once we have a digital signal, we filter it to decrease the noise level.
Finally, we store bursts of filtered samples before post-processing them.
% TODO: draw a diagram
In this particular case, we want to optimize the filtering step to obtain the best noise
rejection for a constrained amount of resources, or the minimal resource
consumption for a given rejection objective.

The first step of our approach is to model the DSP chain. Since we only optimize
the filtering, we do not model the PRN generator or the ADC. The filtering can be
done in two ways. In the first, which we call the monolithic approach, a single FIR filter
with many coefficients rejects the noise. In the second,
we select several FIR filters with fewer coefficients than the monolithic filter and cascade
them to filter the signal.

After each filter we leave the possibility of shifting the filtered data to consume
fewer resources. Hence, in the case of cascaded filters, we define a stage as a filter
followed by a shifter (the shift can be omitted if we do not need to divide the filtered data).

\subsection{Model of a FIR filter}
A cascade of filters is composed of $n$ stages. In stage $i$ ($1 \leq i \leq n$),
the FIR has $C_i$ coefficients, each coefficient being an integer value coded on $\pi^C_i$
bits, and the filtered data are shifted by $\pi^S_i$ bits. We also define $\pi^-_i$ as
the size of the input data and $\pi^+_i$ as the size of the output data. Figure~\ref{fig:fir_stage}
shows a filtering stage.
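
As a reminder of why the shifter matters, the following is a standard fixed-point bound and not a result specific to this work. Multiplying a signed $\pi_i^-$-bit sample by a signed $\pi_i^C$-bit coefficient yields a product that fits in $\pi_i^- + \pi_i^C$ bits, and accumulating $C_i$ such products adds at most $\lceil \log_2 C_i \rceil$ bits, so that a full-precision FIR output needs at most
\begin{align*}
\pi_i^- + \pi_i^C + \lceil \log_2 C_i \rceil
\end{align*}
bits. For instance, with the hypothetical values $\pi_i^- = 16$, $\pi_i^C = 8$ and $C_i = 32$, this bound is $16 + 8 + 5 = 29$~bits. Dropping the $\pi_i^S$ least significant bits in the shifter keeps the word size passed to the next stage under control.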

\begin{figure}
\centering
\begin{tikzpicture}[node distance=2cm]
\node[draw,minimum size=1.3cm] (FIR) { $C_i, \pi_i^C$ } ;
\node[draw,minimum size=1.3cm] (Shift) [right of=FIR, ] { $\pi_i^S$ } ;
\node (Start) [left of=FIR] { } ;
\node (End) [right of=Shift] { } ;

\node[draw,fit=(FIR) (Shift)] (Filter) { } ;

\draw[->] (Start) edge node [above] { $\pi_i^-$ } (FIR) ;
\draw[->] (FIR) -- (Shift) ;
\draw[->] (Shift) edge node [above] { $\pi_i^+$ } (End) ;
\end{tikzpicture}
\caption{A single filter is composed of a FIR (on the left) and a Shifter (on the right)}
\label{fig:fir_stage}
\end{figure}

FIR $i$ can reject $F(C_i, \pi_i^C)$ dB. $F$ is determined numerically.
To measure this rejection, we use the GNU Octave software to design FIR filter coefficients with two
algorithms (\texttt{firls} and \texttt{fir1}).
For each configuration $(C_i, \pi_i^C)$, we first create a FIR with floating point coefficients and a given number $C_i$ of coefficients.
Then, the floating point coefficients are discretized into integers. In order to ensure that the coefficients are effectively coded on $\pi_i^C$~bits,
the coefficients are normalized by their absolute maximum before being scaled to integer coefficients.
At least one coefficient is coded on $\pi_i^C$~bits, and in practice only $b_{C_i/2}$ is coded on $\pi_i^C$~bits while the others are coded on far fewer bits.
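
The normalization and scaling step can be sketched as follows, assuming signed coefficients with a symmetric integer range and rounding to nearest. Both are assumptions made for this illustration, since the exact rounding rule is not specified here. Denoting $b_k$ the floating point coefficients and $\hat b_k$ the integer ones (a notation introduced for this sketch),
\begin{align*}
\hat b_k = \mathrm{round}\left( \left(2^{\pi_i^C-1}-1\right) \frac{b_k}{\max_j |b_j|} \right)
\end{align*}
so that the largest coefficient maps to $\pm\left(2^{\pi_i^C-1}-1\right)$ and any coefficient satisfying $|b_k| < \max_j |b_j| / \left(2\,(2^{\pi_i^C-1}-1)\right)$ is rounded to zero. With $\pi_i^C = 6$, this threshold is $1/62$, i.e. about $1.6\%$ of the largest coefficient, which illustrates how the tails of a long filter can vanish as in Fig.~\ref{float_vs_int}.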

With these coefficients, the \texttt{freqz} function is used to estimate the magnitude of the filter.
Comparing the performance between FIRs however requires a unique criterion. As shown in figure~\ref{fig:fir_mag},
the FIR magnitude exhibits two parts.

\begin{figure}
\centering
\begin{tikzpicture}[scale=0.3]
\draw[<->] (0,15) -- (0,0) -- (21,0) ;
\draw[thick] (0,12) -- (8,12) -- (20,0) ;

\draw (0,14) node [left] { $P$ } ;
\draw (20,0) node [below] { $f$ } ;

\draw[>=latex,<->] (0,14) -- (8,14) ;
\draw (4,14) node [above] { passband } node [below] { $40\%$ } ;

\draw[>=latex,<->] (8,14) -- (12,14) ;
\draw (10,14) node [above] { transition } node [below] { $20\%$ } ;

\draw[>=latex,<->] (12,14) -- (20,14) ;
\draw (16,14) node [above] { stopband } node [below] { $40\%$ } ;

\draw[>=latex,<->] (16,12) -- (16,8) ;
\draw (16,10) node [right] { rejection } ;

\draw[dashed] (8,-1) -- (8,14) ;
\draw[dashed] (12,-1) -- (12,14) ;

\draw[dashed] (8,12) -- (16,12) ;
\draw[dashed] (12,8) -- (16,8) ;

\end{tikzpicture}
% \includegraphics[width=.5\linewidth]{images/fir_magnitude}
\caption{Shape of the filter transmitted power $P$ as a function of frequency $f$:
the passband is considered to occupy the initial 40\% of the Nyquist frequency range,
the stopband the last 40\%, allowing 20\% transition width.}
\label{fig:fir_mag}
\end{figure}

In the transition band, the behavior of the filter is left free: we only care about the passband and the stopband.
Our first criterion considers the mean value of the stopband rejection, as shown in figure~\ref{fig:mean_criterion}. This criterion does not work because it ignores the shape of the passband.
A second criterion considers the maximum rejection within the stopband minus the mean of the absolute value of the passband rejection. With this criterion, the results are significantly improved, as shown in figure~\ref{fig:custom_criterion}.
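
To fix ideas, the two criteria can be written as follows under one possible sign convention, which is an assumption of this sketch since the convention is not spelled out here. Let $G(f) = 20\log_{10}|H(f)|$ be the filter gain in dB, $\mathcal{P}$ the set of passband frequency samples and $\mathcal{S}$ the set of stopband frequency samples ($G$, $H$, $\mathcal{P}$ and $\mathcal{S}$ are notations introduced for this illustration). The mean criterion then reads
\begin{align*}
F_{\mathrm{mean}} = -\frac{1}{|\mathcal{S}|}\sum_{f\in\mathcal{S}} G(f)
\end{align*}
while the second criterion, taking the stopband term at the frequency where the gain is highest, reads
\begin{align*}
F_{\mathrm{custom}} = -\max_{f\in\mathcal{S}} G(f) - \frac{1}{|\mathcal{P}|}\sum_{f\in\mathcal{P}} |G(f)|
\end{align*}
With this convention both quantities are rejections expressed in dB, and any ripple or attenuation in the passband lowers $F_{\mathrm{custom}}$.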

264

261

\begin{figure}

265

262

\begin{figure}

\centering

266

263

\centering

\includegraphics[width=\linewidth]{images/colored_mean_criterion}

267

264

\includegraphics[width=\linewidth]{images/colored_mean_criterion}

\caption{Mean criterion comparison between monolithic filter and cascade filters}

268

265

\caption{Mean criterion comparison between monolithic filter and cascade filters}

\label{fig:mean_criterion}

269

266

\label{fig:mean_criterion}

\end{figure}

270

267

\end{figure}

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/colored_custom_criterion}
\caption{Custom criterion comparison between monolithic filter and cascade filters}
\label{fig:custom_criterion}
\end{figure}

Thanks to this criterion, we are able to automatically generate many sets of FIR coefficients
and estimate their rejection. Figure~\ref{fig:rejection_pyramid} shows the
rejection as a function of the number of coefficients and the number of bits per coefficient.
The surface looks like a pyramid whose edge holds the best
coefficient sets. Indeed, for a given number of coefficients, increasing the number
of bits beyond the edge does not improve the rejection. Conversely, for a given
number of bits, increasing the number of coefficients beyond the edge does not improve
the rejection either. Hence the best coefficient sets lie on the edge of the pyramid.

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/rejection_pyramid}
\caption{Rejection as a function of number of coefficients and number of bits}
\label{fig:rejection_pyramid}
\end{figure}

Although we have an efficient criterion to estimate the rejection of one set of coefficients,
a problem arises when we sum two or more criteria. If the FIR filter coefficients are the same
in every stage, we have:
$$F_{total} = F_1 + F_2$$
But when we choose two different sets of coefficients, the previous equality no longer
holds. Figure~\ref{fig:sum_rejection} illustrates the problem. The red and blue curves
are two different sets of filter coefficients, and their maxima in the stopband
are not at the same frequency. So when we sum the rejection criteria (the dotted yellow line)
we do not meet the dashed yellow line. Defining the rejection of cascaded filters
is therefore more difficult than summing the rejection criteria of the individual filters.
However, this sum gives a conservative bound on the rejection of the cascade: in practice we obtain
a better rejection than the sum predicts.

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/cascaded_criterion}
\caption{Rejection of two cascaded filters}
\label{fig:sum_rejection}
\end{figure}
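
The gap between the two yellow lines can be made explicit with a short derivation. As an illustrative notation not used elsewhere in this paper, write $G_1(f)$ and $G_2(f)$ for the gains in dB of the two stages and $\mathcal{S}$ for the stopband. Cascading multiplies the frequency responses, so the gains in dB add, and the worst-case stopband gain of the cascade satisfies:
\begin{align*}
\max_{f \in \mathcal{S}} \left( G_1(f) + G_2(f) \right) \leq \max_{f \in \mathcal{S}} G_1(f) + \max_{f \in \mathcal{S}} G_2(f)
\end{align*}
Equality holds when both maxima are reached at the same frequency, which is the case for identical stages. For different stages the left-hand side is generally lower, so the sum of the individual worst cases underestimates the stopband attenuation of the cascade. This sketch only covers the stopband term of the criterion.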

The first problem we address is to maximize the rejection under bounded silicon area
and feasibility constraints. Variable $a_i$ is the area taken by filter~$i$
(in arbitrary units). Variable $r_i$ is the rejection of filter~$i$ (in dB).
Constant $\mathcal{A}$ is the total available area. Our abstract model is described by the following expressions:

\begin{align}
\text{Maximize } & \sum_{i=1}^n r_i \notag \\
\sum_{i=1}^n a_i & \leq \mathcal{A} & \label{eq:area} \\
a_i & = C_i \times (\pi_i^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef} \\
r_i & = F(C_i, \pi_i^C), & \forall i \in [1, n] \label{eq:rejectiondef} \\
\pi_i^+ & = \pi_i^- + \pi_i^C - \pi_i^S, & \forall i \in [1, n] \label{eq:bits} \\
\pi_{i - 1}^+ & = \pi_i^-, & \forall i \in [2, n] \label{eq:inout} \\
\pi_i^+ & \geq 1 + \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right), & \forall i \in [1, n] \label{eq:maxshift} \\
\pi_1^- &= \Pi^I \label{eq:init}
\end{align}

Equation~\ref{eq:area} states that the total area taken by the filters must not
exceed the available area. Equation~\ref{eq:areadef} gives the definition of
the area for a filter. More precisely, it is the area of the FIR, as the Shifter
does not need any circuitry. We consider that the FIR needs $C_i$ registers of size
$\pi_i^C + \pi_i^-$~bits to store the results of the multiplications of the
input data and the coefficients. Equation~\ref{eq:rejectiondef} gives the
definition of the rejection of the filter thanks to function~$F$ that we defined
previously. The Shifter does not introduce negative rejection as we explain later,
so the rejection only comes from the FIR. Equation~\ref{eq:bits} states the
relation between $\pi_i^+$ and $\pi_i^-$. The multiplications in the FIR add
$\pi_i^C$ bits as most coefficients are close to zero, and the Shifter removes
$\pi_i^S$ bits. Equation~\ref{eq:inout} states that the output number of bits of
a filter is the same as the input number of bits of the next filter.
Equation~\ref{eq:maxshift} ensures that the Shifter does not introduce negative
rejection. Indeed, the results of the FIR can be right shifted without compromising
the quality of the rejection up to a threshold. Each bit of the output data
increases the maximum rejection level by 6~dB. We add one to take the sign bit
into account. If equation~\ref{eq:maxshift} were not present, the Shifter could
shift too much and introduce some noise in the output data. Each supplementary
shift bit would cause 6~dB of noise. A totally equivalent equation is:
$\pi_i^S \leq \pi_i^- + \pi_i^C - 1 - \sum_{k=1}^{i} \left(1 + \frac{r_k}{6}\right) $.
Finally, equation~\ref{eq:init} gives the number of bits of the global input.
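
The figure of 6~dB per bit used in equation~\ref{eq:maxshift} follows from the dynamic range of a fixed-point word. As a brief reminder, with $b$ denoting a number of magnitude bits (a symbol introduced here for illustration only), the ratio between the largest and the smallest representable amplitude is $2^b$, that is, in dB:
\begin{align*}
20 \log_{10}\left(2^{b}\right) = b \times 20 \log_{10}(2) \approx 6.02\, b~\text{dB}
\end{align*}
For instance, a hypothetical rejection of 60~dB requires about $60 / 6 = 10$ magnitude bits, to which equation~\ref{eq:maxshift} adds the sign bit and one bit per stage.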

This model is neither linear nor quadratic, as $F$ does not have a known
linear or quadratic expression. We introduce $p$ FIR configurations
$(C_{ij}, \pi_{ij}^C), 1 \leq j \leq p$ that are constants. We define the binary
variable $\delta_{ij}$ that has value 1 if stage~$i$ is in configuration~$j$
and 0 otherwise. The new equations are as follows:

\begin{align}
a_i & = \sum_{j=1}^p \delta_{ij} \times C_{ij} \times (\pi_{ij}^C + \pi_i^-), & \forall i \in [1, n] \label{eq:areadef2} \\
r_i & = \sum_{j=1}^p \delta_{ij} \times F(C_{ij}, \pi_{ij}^C), & \forall i \in [1, n] \label{eq:rejectiondef2} \\
\pi_i^+ & = \pi_i^- + \left(\sum_{j=1}^p \delta_{ij} \pi_{ij}^C\right) - \pi_i^S, & \forall i \in [1, n] \label{eq:bits2} \\
\sum_{j=1}^p \delta_{ij} & \leq 1, & \forall i \in [1, n] \label{eq:config}
\end{align}

Equations \ref{eq:areadef2}, \ref{eq:rejectiondef2} and \ref{eq:bits2} replace
equations \ref{eq:areadef}, \ref{eq:rejectiondef} and \ref{eq:bits}, respectively.
Equation~\ref{eq:config} states that at most one configuration is chosen for each stage.

This modified model is quadratic, and it can be linearized if necessary. The Gurobi
(\url{www.gurobi.com}) optimization software is used to solve this quadratic
model, and since Gurobi is able to linearize it, the model is left as is. This model
has $O(np)$ variables and $O(n)$ constraints.
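
As an illustration of how such a linearization can be carried out by hand, note that the only quadratic terms are the products $\delta_{ij} \times \pi_i^-$ in equation~\ref{eq:areadef2}. A standard technique, sketched here under the assumption that $0 \leq \pi_i^- \leq M$ for a known constant $M$, replaces each product by an auxiliary variable $y_{ij}$. The names $y_{ij}$ and $M$ are introduced for this example only, and this is not necessarily the reformulation that Gurobi applies internally:
\begin{align*}
0 \leq y_{ij} & \leq M \, \delta_{ij} \\
\pi_i^- - M \, (1 - \delta_{ij}) \leq y_{ij} & \leq \pi_i^-
\end{align*}
When $\delta_{ij} = 0$ the first line forces $y_{ij} = 0$, and when $\delta_{ij} = 1$ the second line forces $y_{ij} = \pi_i^-$, so that $a_i = \sum_{j=1}^p C_{ij} \left( \delta_{ij} \pi_{ij}^C + y_{ij} \right)$ becomes linear.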

Section~\ref{sec:fixed_area} shows the results for this first version of the quadratic program, whereas section~\ref{sec:fixed_rej}
presents the results for the complementary problem. In the latter case we want to
minimize the occupied area for a targeted rejection level. Hence we replace
the objective function with:

\begin{align}
\text{Minimize } & \sum_{i=1}^n a_i \notag
\end{align}

We adapt the constraints of the quadratic program by replacing equation~\ref{eq:area}
with equation~\ref{eq:rejection_min}, where $\mathcal{R}$ is the minimal
rejection required.

\begin{align}
\sum_{i=1}^n r_i & \geq \mathcal{R} & \label{eq:rejection_min}
\end{align}

\section{Design workflow}
\label{sec:workflow}

In this section, we describe the workflow used to compute all the results presented in section~\ref{sec:fixed_area}.
Figure~\ref{fig:workflow} shows the global workflow and the different steps involved in the computation of the results.

\begin{figure}
\centering
\begin{tikzpicture}[node distance=0.75cm and 2cm]
\node[draw,minimum size=1cm] (Solver) { Filter Solver } ;
\node (Start) [left= 3cm of Solver] { } ;
\node[draw,minimum size=1cm] (TCL) [right= of Solver] { TCL Script } ;
\node (Input) [above= of TCL] { } ;
\node[draw,minimum size=1cm] (Deploy) [below= of Solver] { Deploy Script } ;
\node[draw,minimum size=1cm] (Bitstream) [below= of TCL] { Bitstream } ;
\node[draw,minimum size=1cm,rounded corners] (Board) [below right= of Deploy] { Board } ;
\node[draw,minimum size=1cm] (Postproc) [below= of Deploy] { Post-Processing } ;
\node (Results) [left= of Postproc] { } ;

\draw[->] (Start) edge node [above] { $\mathcal{A}, n, \Pi^I$ } node [below] { $(C_{ij}, \pi_{ij}^C), F$ } (Solver) ;
\draw[->] (Input) edge node [left] { ADC or PRN } (TCL) ;
\draw[->] (Solver) edge node [below] { (1a) } (TCL) ;
\draw[->] (Solver) edge node [right] { (1b) } (Deploy) ;
\draw[->] (TCL) edge node [left] { (2) } (Bitstream) ;
\draw[->,dashed] (Bitstream) -- (Deploy) ;
\draw[->] (Deploy) to[out=-30,in=120] node [above] { (3) } (Board) ;
\draw[->] (Board) to[out=150,in=-60] node [below] { (4) } (Deploy) ;
\draw[->] (Deploy) edge node [left] { (5) } (Postproc) ;
\draw[->] (Postproc) -- (Results) ;
\end{tikzpicture}
\caption{Design workflow from the input parameters to the results}
\label{fig:workflow}
\end{figure}

The filter solver is a C++ program that takes as input the maximum area
$\mathcal{A}$, the number of stages $n$, the size of the input signal $\Pi^I$,
the FIR configurations $(C_{ij}, \pi_{ij}^C)$ and the function $F$. It creates
the quadratic programs and uses the Gurobi solver to get the optimal results.
Then it produces two scripts: a TCL script ((1a) on figure~\ref{fig:workflow})
and a deploy script ((1b) on figure~\ref{fig:workflow}).

The TCL script describes the whole digital processing chain from the beginning
(the raw signal data) to the end (the filtered data).
The raw input data is generated by a Pseudo Random Number (PRN)
generator inside the FPGA and $\Pi^I$ is fixed at 16~bits.
Then the script builds each stage of the chain with a generic FIR task that
comes from a skeleton library. The generic FIR is highly configurable
with the number of coefficients and the size of the coefficients. The coefficients
themselves are not stored in the script.
Although the signal is processed in real time, the output signal is stored as
consecutive bursts of data.

The TCL script is used by Vivado to produce the FPGA bitstream ((2) on figure~\ref{fig:workflow}).
We use the 2018.2 version of Xilinx Vivado and we execute the synthesized
bitstream on a Redpitaya board fitted with a Xilinx Zynq-7010 series
FPGA (xc7z010clg400-1) and two 125~MS/s ADCs.
The board runs a Buildroot Linux image. We have developed some tools and
drivers to flash and communicate with the FPGA. They are used to automate
the whole workflow on the board: load the filter coefficients and retrieve the
computed data.

The deploy script uploads the bitstream to the board ((3) on
figure~\ref{fig:workflow}), flashes the FPGA, loads the different drivers,
and configures the coefficients of the FIR filters. It then waits for the results
and retrieves the data to the main computer ((4) on figure~\ref{fig:workflow}).

Finally, an Octave post-processing script computes the final results from
the output data ((5) on figure~\ref{fig:workflow}).
The results are normalized so that the Power Spectral Density (PSD) starts at zero
and the different configurations can be compared.

The workflow used to compute the results in section~\ref{sec:fixed_rej} is the same: we
have only adapted the quadratic program, and the rest of the workflow is unchanged.

\section{Experiments with fixed area space}
\label{sec:fixed_area}

This section presents the output of the filter solver {\em i.e.} the computed
configurations for each stage, the computed rejection and the computed silicon area.
This helps to understand the choices made by the solver to compute its solutions.

The experimental setup is composed of three cases. The raw input is generated
by a Pseudo Random Number (PRN) generator, which fixes the input data size $\Pi^I$.
Then the total silicon area $\mathcal{A}$ has been fixed to either 500, 1000 or 1500
arbitrary units. Hence, the three cases have been named: MAX/500, MAX/1000, MAX/1500.
The number of configurations $p$ is 1827, with $C_i$ ranging from 3 to 60 and $\pi^C$
ranging from 2 to 22. In each case, the quadratic program has been able to give a
result for up to five stages ($n = 5$) in the cascaded filter.

Tables~\ref{tbl:gurobi_max_500}, \ref{tbl:gurobi_max_1000} and \ref{tbl:gurobi_max_1500} show the results obtained by the filter solver for MAX/500, MAX/1000 and MAX/1500, respectively.

\renewcommand{\arraystretch}{1.4}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/500}
\label{tbl:gurobi_max_500}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (21, 7, 0) & - & - & - & - & 32~dB & 483 \\
2 & (3, 3, 15) & (31, 9, 0) & - & - & - & 58~dB & 460 \\
3 & (3, 3, 15) & (27, 9, 0) & (5, 3, 0) & - & - & 66~dB & 488 \\
4 & (3, 3, 15) & (19, 7, 0) & (11, 5, 0) & (3, 3, 0) & - & 74~dB & 499 \\
5 & (3, 3, 15) & (23, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 78~dB & 489 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1000}
\label{tbl:gurobi_max_1000}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (37, 11, 0) & - & - & - & - & 56~dB & 999 \\
2 & (3, 3, 15) & (51, 14, 0) & - & - & - & 87~dB & 975 \\
3 & (3, 3, 15) & (35, 11, 0) & (19, 7, 0) & - & - & 99~dB & 1000 \\
4 & (3, 4, 16) & (27, 8, 0) & (19, 7, 1) & (11, 5, 0) & - & 103~dB & 998 \\
5 & (3, 3, 15) & (31, 9, 0) & (19, 7, 0) & (3, 3, 1) & (3, 3, 0) & 111~dB & 984 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MAX/1500}
\label{tbl:gurobi_max_1500}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (47, 15, 0) & - & - & - & - & 71~dB & 1457 \\
2 & (19, 6, 15) & (51, 14, 0) & - & - & - & 103~dB & 1489 \\
3 & (3, 3, 15) & (35, 11, 0) & (35, 11, 0) & - & - & 122~dB & 1492 \\
4 & (3, 3, 15) & (27, 8, 0) & (19, 7, 0) & (27, 9, 0) & - & 129~dB & 1498 \\
5 & (3, 3, 15) & (23, 9, 2) & (27, 9, 0) & (19, 7, 0) & (3, 3, 0) & 136~dB & 1499 \\
\hline
\end{tabular}
}
\end{table}

\renewcommand{\arraystretch}{1}

From these tables, we can first state that the more stages are used to define
the cascaded FIR filters, the better the rejection. This was an expected result, as it has
been previously observed that many small filters are better than
a single large filter \cite{lim_1988, lim_1996, young_1992}, although this conclusion
is hardly used in practice due to the lack of tools for identifying individual filter
coefficients in the cascaded approach.

Second, the larger the silicon area, the better the rejection. This was also an
expected result as more area means a filter of better quality (more coefficients
or more bits per coefficient).

Then, we also observe that the first stage can have a larger shift than the other
stages. This is explained by the fact that the solver tries to use just enough
bits for the computed rejection after each stage. In the first stage, a
balance between a strong rejection and a low number of bits is targeted. Equation~\ref{eq:maxshift}
gives the relation between both values.
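
Equation~\ref{eq:maxshift} can be checked by hand on the two-stage solution of table~\ref{tbl:gurobi_max_500}, namely $(3, 3, 15)$ followed by $(31, 9, 0)$, with $\Pi^I = 16$~bits as fixed in section~\ref{sec:workflow}. The following worked example is only a sketch: it assumes that the rejection reported in the table is the sum $r_1 + r_2 = 58$~dB.
\begin{align*}
\pi_1^+ & = 16 + 3 - 15 = 4 \\
\pi_2^+ & = 4 + 9 - 0 = 13 \\
1 + \sum_{k=1}^{2} \left(1 + \frac{r_k}{6}\right) & = 3 + \frac{58}{6} \approx 12.7 \leq 13
\end{align*}
The output of the second stage therefore keeps just enough bits for the accumulated rejection. The large shift of the first stage brings its output down to 4~bits, which equation~\ref{eq:maxshift} allows only if $r_1 \leq 12$~dB.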

Finally, we note that the solver consumes almost all the given silicon area.

547

560

The following graphs present the rejection for real data on the FPGA. In all following
figures, the solid line represents the actual rejection of the filtered
data on the FPGA as measured experimentally and the dashed line is the noise level
given by the quadratic solver. The configurations are those computed in the previous section.

Figures~\ref{fig:max_500_result}, \ref{fig:max_1000_result} and \ref{fig:max_1500_result} show
the rejection of the different configurations in the cases of MAX/500, MAX/1000 and MAX/1500, respectively.

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/max_500}
\caption{Signal spectrum for MAX/500}
\label{fig:max_500_result}
\end{figure}

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/max_1000}
\caption{Signal spectrum for MAX/1000}
\label{fig:max_1000_result}
\end{figure}

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/max_1500}
\caption{Signal spectrum for MAX/1500}
\label{fig:max_1500_result}
\end{figure}

In all cases, we observe that the actual rejection is close to the rejection computed by the solver.

We compare the actual silicon resources reported by Vivado to the
resources expressed in arbitrary units.
The goal is to check that our arbitrary unit of silicon area models the real
resources on the FPGA well enough. In particular, we want to verify that, for a given
number of arbitrary units, the actual silicon resources do not depend on the
number of stages $n$. Most significantly, our approach aims
at remaining far enough from the practical logic gate implementation used by
various vendors to remain platform independent and be portable from one
architecture to another.

Table~\ref{tbl:resources_usage} shows the resource usage in the case of MAX/500, MAX/1000 and
MAX/1500, \emph{i.e.} when the maximum allowed silicon area is fixed to 500, 1000
and 1500 arbitrary units. We have taken care to extract solely the resources used by
the FIR filters and to exclude additional processing blocks, including the FIFO and the PL to
PS communication.

\begin{table}
\caption{Resource occupation. The last column refers to available resources on a Zynq-7010 as found on the Redpitaya.}
\label{tbl:resources_usage}
\centering
\begin{tabular}{|c|c|ccc|c|}
\hline
$n$ & Resource & MAX/500 & MAX/1000 & MAX/1500 & \emph{Zynq-7010} \\ \hline\hline
  & LUT  & 249  & 453  & 627  & \emph{17600} \\
1 & BRAM & 1    & 1    & 1    & \emph{120} \\
  & DSP  & 21   & 37   & 47   & \emph{80} \\ \hline
  & LUT  & 2374 & 5494 & 691  & \emph{17600} \\
2 & BRAM & 2    & 2    & 2    & \emph{120} \\
  & DSP  & 0    & 0    & 70   & \emph{80} \\ \hline
  & LUT  & 2443 & 3304 & 3521 & \emph{17600} \\
3 & BRAM & 3    & 3    & 3    & \emph{120} \\
  & DSP  & 0    & 19   & 35   & \emph{80} \\ \hline
  & LUT  & 2634 & 3753 & 2557 & \emph{17600} \\
4 & BRAM & 4    & 4    & 4    & \emph{120} \\
  & DSP  & 0    & 19   & 46   & \emph{80} \\ \hline
  & LUT  & 2423 & 3047 & 2847 & \emph{17600} \\
5 & BRAM & 5    & 5    & 5    & \emph{120} \\
  & DSP  & 0    & 22   & 46   & \emph{80} \\ \hline
\end{tabular}
\end{table}

In some cases, Vivado replaces the DSPs by Look-Up Tables (LUTs). We assume that,
when the filter coefficients are small enough, or when the input size is small
enough, Vivado optimizes resource consumption by selecting multiplexers to
implement the multiplications instead of a DSP. In this case, it is quite difficult
to compare the whole silicon budget.

However, a rough estimate can be made with a simple equivalence. Looking at
the first column (MAX/500), where the number of LUTs is quite stable for $n \geq 2$,
we can deduce that a DSP is roughly equivalent to 100~LUTs in terms of silicon
area use. With this equivalence, our 500 arbitrary units correspond to 2500~LUTs,
1000 arbitrary units correspond to 5000~LUTs and 1500 arbitrary units correspond
to 7300~LUTs. The conclusion is that the orders of magnitude of our arbitrary
unit are quite good. The relatively small differences can probably be explained
by the optimizations done by Vivado based on the detailed map of available processing resources.

We now present the computation time needed to solve the quadratic problem.
For each case, the filter solver software is executed on an Intel(R) Xeon(R) CPU E5606
clocked at 2.13~GHz. The CPU has 8 cores that are used by Gurobi to solve
the quadratic problem.

Table~\ref{tbl:area_time} shows the time needed to solve the quadratic
problem when the maximal area is fixed to 500, 1000 and 1500 arbitrary units.

\begin{table}
\caption{Time to solve the quadratic program with Gurobi}
\label{tbl:area_time}
\centering
\begin{tabular}{|c|c|c|c|}\hline
$n$ & Time (MAX/500) & Time (MAX/1000) & Time (MAX/1500) \\\hline\hline
1 & 0.1~s & 0.1~s & 0.3~s \\
2 & 1.1~s & 2.2~s & 12~s \\
3 & 17~s & 137~s ($\approx$ 2~min) & 275~s ($\approx$ 4~min) \\
4 & 52~s & 5448~s ($\approx$ 90~min) & 5505~s ($\approx$ 17~h) \\
5 & 286~s ($\approx$ 4~min) & 4119~s ($\approx$ 68~min) & 235479~s ($\approx$ 3~days) \\\hline
\end{tabular}
\end{table}

As expected, the computation time seems to rise exponentially with the number of stages. % TODO: exponential?
When the area is limited, the design exploration space is smaller and the solver is able to
find an optimal solution faster. On the contrary, in the case of MAX/1500 with
5~stages, we were not able to obtain a result after 40~hours of computation, so we decided to stop.
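
To give an order of magnitude for this growth, the MAX/500 column of
Table~\ref{tbl:area_time} yields the following ratios between successive numbers of stages,
where $t_n$ denotes the solving time with $n$ stages (a notation introduced here for convenience).
This is only an illustrative reading of five measured points, not a fitted model:
\begin{align*}
t_2 / t_1 &= 1.1 / 0.1 = 11, \\
t_3 / t_2 &= 17 / 1.1 \approx 15.5, \\
t_4 / t_3 &= 52 / 17 \approx 3.1, \\
t_5 / t_4 &= 286 / 52 = 5.5.
\end{align*}
The geometric mean of these ratios is $(286 / 0.1)^{1/4} \approx 7.3$, so each additional
stage multiplies the solving time by roughly 7 on average in this case.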

\section{Experiments with fixed rejection target}
\label{sec:fixed_rej}
This section presents the results of the complementary quadratic program, in which we
minimize the area occupation for a targeted noise level.

The experimental setup is again composed of three cases. The raw input is the same
as in the previous section, a PRN generator, which fixes the input data size $\Pi^I$.
The targeted rejection $\mathcal{R}$ is then fixed to either 40, 60 or 80~dB.
Hence, the three cases are named MIN/40, MIN/60 and MIN/80.
The number of configurations $p$ is the same as in the previous section.

Tables~\ref{tbl:gurobi_min_40}, \ref{tbl:gurobi_min_60} and \ref{tbl:gurobi_min_80} show
the results obtained by the filter solver for MIN/40, MIN/60 and MIN/80, respectively.

\renewcommand{\arraystretch}{1.4}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/40}
\label{tbl:gurobi_min_40}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (27, 8, 0) & - & - & - & - & 41~dB & 648 \\
2 & (3, 2, 14) & (19, 7, 0) & - & - & - & 40~dB & 263 \\
3 & (3, 3, 15) & (11, 5, 0) & (3, 3, 0) & - & - & 41~dB & 192 \\
4 & (3, 3, 15) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & - & 42~dB & 147 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/60}
\label{tbl:gurobi_min_60}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (39, 13, 0) & - & - & - & - & 60~dB & 1131 \\
2 & (3, 3, 15) & (35, 10, 0) & - & - & - & 60~dB & 547 \\
3 & (3, 3, 15) & (27, 8, 0) & (3, 3, 0) & - & - & 62~dB & 426 \\
4 & (3, 2, 14) & (11, 5, 1) & (11, 5, 0) & (3, 3, 0) & - & 60~dB & 344 \\
5 & (3, 2, 14) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & (3, 3, 0) & 60~dB & 279 \\
\hline
\end{tabular}
}
\end{table}

\begin{table}
\caption{Configurations $(C_i, \pi_i^C, \pi_i^S)$, rejections and areas (in arbitrary units) for MIN/80}
\label{tbl:gurobi_min_80}
\centering
{\scalefont{0.77}
\begin{tabular}{|c|ccccc|c|c|}
\hline
$n$ & $i = 1$ & $i = 2$ & $i = 3$ & $i = 4$ & $i = 5$ & Rejection & Area \\
\hline
1 & (55, 16, 0) & - & - & - & - & 81~dB & 1760 \\
2 & (3, 3, 15) & (47, 14, 0) & - & - & - & 80~dB & 903 \\
3 & (3, 3, 15) & (23, 9, 0) & (19, 7, 0) & - & - & 80~dB & 698 \\
4 & (3, 3, 15) & (27, 9, 0) & (7, 7, 4) & (3, 3, 0) & - & 80~dB & 605 \\
5 & (3, 2, 14) & (27, 8, 0) & (3, 3, 1) & (3, 3, 0) & (3, 3, 0) & 81~dB & 534 \\
\hline
\end{tabular}
}
\end{table}
\renewcommand{\arraystretch}{1}

From these tables, we can first state that all configurations reach the target rejection
level. Furthermore, the area of the monolithic filter is about twice that of the two-stage cascade.
More generally, the more stages are cascaded, the smaller the occupied area in arbitrary units.
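
The factor of two can be read directly from the Area column of
Tables~\ref{tbl:gurobi_min_40}, \ref{tbl:gurobi_min_60} and \ref{tbl:gurobi_min_80}
by dividing the area obtained for $n = 1$ by the area obtained for $n = 2$:
\begin{align*}
\text{MIN/40:}\quad & 648 / 263 \approx 2.5, \\
\text{MIN/60:}\quad & 1131 / 547 \approx 2.1, \\
\text{MIN/80:}\quad & 1760 / 903 \approx 1.9.
\end{align*}
Between $n = 1$ and the largest cascade listed in each table, the area is divided by
$648 / 147 \approx 4.4$, $1131 / 279 \approx 4.1$ and $1760 / 534 \approx 3.3$, respectively.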

As in the previous section, the solver always chooses a small filter as the first
filter stage, and the second one is often the largest filter. This choice can be explained
in the same way as in the previous section. The solver uses just enough bits not to degrade the input
signal, and in the second stage it can choose a better filter to improve the rejection without
having too many bits in the output data.
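
The impact of the number of bits on the area can be illustrated on one row of
Table~\ref{tbl:gurobi_min_40}. The sketch below relies on assumptions that are not restated
in this section: the area of stage $i$ is taken as $a_i = C_i \times (\pi_i^C + \pi_i^-)$,
where $\pi_i^-$ is the number of bits entering stage $i$ ($a_i$ and $\pi_i^-$ are notations
introduced here for the example); each stage is assumed to output
$\pi_i^- + \pi_i^C - \pi_i^S$ bits; and the PRN input is assumed to be 16~bits wide.
Under these assumptions, the configuration $(3, 3, 15)$, $(11, 5, 0)$, $(3, 3, 0)$ obtained
for MIN/40 with $n = 3$ gives
\begin{align*}
\pi_1^- &= 16,              & a_1 &= 3 \times (3 + 16) = 57, \\
\pi_2^- &= 16 + 3 - 15 = 4, & a_2 &= 11 \times (5 + 4) = 99, \\
\pi_3^- &= 4 + 5 - 0 = 9,   & a_3 &= 3 \times (3 + 9) = 36,
\end{align*}
so that $a_1 + a_2 + a_3 = 192$, which matches the area reported for this row. In this
reading, the large shift of the first stage reduces the data to 4~bits, which keeps the
larger second filter affordable.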

In the specific case of MIN/40 with $n = 5$, the solver has determined that the optimal
number of filters is 4, so it did not choose any configuration for the last filter. Hence this
solution is equivalent to the result for $n = 4$.

The following graphs present the rejection for real data on the FPGA. In all following
figures, the solid line represents the actual rejection of the filtered
data on the FPGA as measured experimentally and the dashed line is the noise level
given by the quadratic solver.

Figures~\ref{fig:min_40}, \ref{fig:min_60} and \ref{fig:min_80} show
the rejection of the different configurations in the cases of MIN/40, MIN/60 and MIN/80, respectively.

\begin{figure}
\centering
\includegraphics[width=\linewidth]{images/min_40}
\caption{Signal spectrum for MIN/40}
