jfriedt / IFCS2018 article

Compare View

Commits (2)

48d886be9 Notation fixes.

Arthur HUGEAT
2018-05-21 01:33:13 +0200
9db1d56ab Merge branch 'master' of https://lxsd.femto-st.fr/gitlab/jfriedt/ifcs2018-article

Arthur HUGEAT
2018-05-21 01:34:52 +0200

Diff

Showing 1 changed file

ifcs2018_proceeding.tex


\documentclass[a4paper,conference]{IEEEtran/IEEEtran}	1	1	% JMF: revise the abstract: it featured the Zynq7010 of the redpitaya to show
\usepackage{graphicx,color,hyperref}	2	2	% how to optimize performance within a finite area. There too we ended up in the case where
\usepackage{amsfonts}	3	3	% the single-FIR solution was simply not synthesizable => merge the two
\usepackage{amsthm}	4	4	% contributions for the TUFFC paper
\usepackage{amssymb}	5	5
\usepackage{amsmath}	6	6	\documentclass[a4paper,conference]{IEEEtran/IEEEtran}
\usepackage{algorithm2e}	7	7	\usepackage{graphicx,color,hyperref}
\usepackage{url,balance}	8	8	\usepackage{amsfonts}
\usepackage[normalem]{ulem}	9	9	\usepackage{amsthm}
% correct bad hyphenation here	10	10	\usepackage{amssymb}
\hyphenation{op-tical net-works semi-conduc-tor}	11	11	\usepackage{amsmath}
\textheight=26cm	12	12	\usepackage{algorithm2e}
\setlength{\footskip}{30pt}	13	13	\usepackage{url,balance}
\pagenumbering{gobble}	14	14	\usepackage[normalem]{ulem}
\begin{document}	15	15	% correct bad hyphenation here
\title{Filter optimization for real time digital processing of radiofrequency signals: application	16	16	\hyphenation{op-tical net-works semi-conduc-tor}
to oscillator metrology}	17	17	\textheight=26cm
	18	18	\setlength{\footskip}{30pt}
\author{\IEEEauthorblockN{A. Hugeat\IEEEauthorrefmark{1}\IEEEauthorrefmark{2}, J. Bernard\IEEEauthorrefmark{2},	19	19	\pagenumbering{gobble}
G. Goavec-M\'erou\IEEEauthorrefmark{1},	20	20	\begin{document}
P.-Y. Bourgeois\IEEEauthorrefmark{1}, J.-M. Friedt\IEEEauthorrefmark{1}}	21	21	\title{Filter optimization for real time digital processing of radiofrequency signals: application
\IEEEauthorblockA{\IEEEauthorrefmark{1}FEMTO-ST, Time \& Frequency department, Besan\c con, France }	22	22	to oscillator metrology}
\IEEEauthorblockA{\IEEEauthorrefmark{2}FEMTO-ST, Computer Science department DISC, Besan\c con, France \\	23	23
Email: \{pyb2,jmfriedt\}@femto-st.fr}	24	24	\author{\IEEEauthorblockN{A. Hugeat\IEEEauthorrefmark{1}\IEEEauthorrefmark{2}, J. Bernard\IEEEauthorrefmark{2},
}	25	25	G. Goavec-M\'erou\IEEEauthorrefmark{1},
\maketitle	26	26	P.-Y. Bourgeois\IEEEauthorrefmark{1}, J.-M. Friedt\IEEEauthorrefmark{1}}
\thispagestyle{plain}	27	27	\IEEEauthorblockA{\IEEEauthorrefmark{1}FEMTO-ST, Time \& Frequency department, Besan\c con, France }
\pagestyle{plain}	28	28	\IEEEauthorblockA{\IEEEauthorrefmark{2}FEMTO-ST, Computer Science department DISC, Besan\c con, France \\
\newtheorem{definition}{Definition}	29	29	Email: \{pyb2,jmfriedt\}@femto-st.fr}
	30	30	}
\begin{abstract}	31	31	\maketitle
Software Defined Radio (SDR) provides stability, flexibility and reconfigurability to	32	32	\thispagestyle{plain}
radiofrequency signal processing. Applied to oscillator characterization in the context	33	33	\pagestyle{plain}
of ultrastable clocks, stringent filtering requirements are defined by spurious signal or	34	34	\newtheorem{definition}{Definition}
noise rejection needs. Since real time radiofrequency processing must be performed in a	35	35
Field Programmable Array to meet timing constraints, we investigate optimization strategies	36	36	\begin{abstract}
to design filters meeting rejection characteristics while limiting the hardware resources	37	37	Software Defined Radio (SDR) provides stability, flexibility and reconfigurability to
required and keeping timing constraints within the targeted measurement bandwidths.	38	38	radiofrequency signal processing. Applied to oscillator characterization in the context
\end{abstract}	39	39	of ultrastable clocks, stringent filtering requirements are defined by spurious signal or
	40	40	noise rejection needs. Since real time radiofrequency processing must be performed in a
\begin{IEEEkeywords}	41	41	Field Programmable Array to meet timing constraints, we investigate optimization strategies
Software Defined Radio, Mixed-Integer Linear Programming, Finite Impulse Response filter	42	42	to design filters meeting rejection characteristics while limiting the hardware resources
\end{IEEEkeywords}	43	43	required and keeping timing constraints within the targeted measurement bandwidths.
	44	44	\end{abstract}
\section{Digital signal processing of ultrastable clock signals}	45	45
	46	46	\begin{IEEEkeywords}
Analog oscillator phase noise characteristics are classically performed by downconverting	47	47	Software Defined Radio, Mixed-Integer Linear Programming, Finite Impulse Response filter
the radiofrequency signal using a saturated mixer to bring the radiofrequency signal to baseband,	48	48	\end{IEEEkeywords}
followed by a Fourier analysis of the beat signal to analyze phase fluctuations close to carrier. In	49	49
a fully digital approach, the radiofrequency signal is digitized and numerically downconverted by	50	50	\section{Digital signal processing of ultrastable clock signals}
multiplying the samples with a local numerically controlled oscillator (Fig. \ref{schema}) \cite{rsi}.	51	51
	52	52	Analog oscillator phase noise characteristics are classically performed by downconverting
\begin{figure}[h!tb]	53	53	the radiofrequency signal using a saturated mixer to bring the radiofrequency signal to baseband,
\begin{center}	54	54	followed by a Fourier analysis of the beat signal to analyze phase fluctuations close to carrier. In
\includegraphics[width=.8\linewidth]{images/schema}	55	55	a fully digital approach, the radiofrequency signal is digitized and numerically downconverted by
\end{center}	56	56	multiplying the samples with a local numerically controlled oscillator (Fig. \ref{schema}) \cite{rsi}.
\caption{Fully digital oscillator phase noise characterization: the Device Under Test	57	57
(DUT) signal is sampled by the radiofrequency grade Analog to Digital Converter (ADC) and	58	58	\begin{figure}[h!tb]
downconverted by mixing with a Numerically Controlled Oscillator (NCO). Unwanted signals	59	59	\begin{center}
and noise aliases are rejected by a Low Pass Filter (LPF) implemented as a cascade of Finite	60	60	\includegraphics[width=.8\linewidth]{images/schema}
Impulse Response (FIR) filters. The signal is then decimated before a Fourier analysis displays	61	61	\end{center}
the spectral characteristics of the phase fluctuations.}	62	62	\caption{Fully digital oscillator phase noise characterization: the Device Under Test
\label{schema}	63	63	(DUT) signal is sampled by the radiofrequency grade Analog to Digital Converter (ADC) and
\end{figure}	64	64	downconverted by mixing with a Numerically Controlled Oscillator (NCO). Unwanted signals
	65	65	and noise aliases are rejected by a Low Pass Filter (LPF) implemented as a cascade of Finite
As with the analog mixer,	66	66	Impulse Response (FIR) filters. The signal is then decimated before a Fourier analysis displays
the non-linear behavior of the downconverter introduces noise or spurious signal aliasing as	67	67	the spectral characteristics of the phase fluctuations.}
well as the generation of the frequency sum signal in addition to the frequency difference.	68	68	\label{schema}
These unwanted spectral characteristics must be rejected before decimating the data stream	69	69	\end{figure}
for the phase noise spectral characterization. The characteristics introduced between the	70	70
downconverter	71	71	As with the analog mixer,
and the decimation processing blocks are core characteristics of an oscillator characterization	72	72	the non-linear behavior of the downconverter introduces noise or spurious signal aliasing as
system, and must reject out-of-band signals below the targeted phase noise -- typically in the	73	73	well as the generation of the frequency sum signal in addition to the frequency difference.
sub -170~dBc/Hz for ultrastable oscillator we aim at characterizing. The filter blocks will	74	74	These unwanted spectral characteristics must be rejected before decimating the data stream
use most resources of the Field Programmable Gate Array (FPGA) used to process the radiofrequency	75	75	for the phase noise spectral characterization \cite{andrich2018high}. The characteristics introduced between the
datastream: optimizing the performance of the filter while reducing the needed resources is	76	76	downconverter
hence tackled in a systematic approach using optimization techniques. Most significantly, we	77	77	and the decimation processing blocks are core characteristics of an oscillator characterization
tackle the issue by attempting to cascade multiple Finite Impulse Response (FIR) filters with	78	78	system, and must reject out-of-band signals below the targeted phase noise -- typically in the
tunable number of coefficients and tunable number of bits representing the coefficients and the	79	79	sub -170~dBc/Hz for ultrastable oscillator we aim at characterizing. The filter blocks will
data being processed.	80	80	use most resources of the Field Programmable Gate Array (FPGA) used to process the radiofrequency
	81	81	datastream: optimizing the performance of the filter while reducing the needed resources is
\section{Finite impulse response filter}	82	82	hence tackled in a systematic approach using optimization techniques. Most significantly, we
	83	83	tackle the issue by attempting to cascade multiple Finite Impulse Response (FIR) filters with
We select FIR filters for their unconditional stability and ease of design. A FIR filter is defined	84	84	tunable number of coefficients and tunable number of bits representing the coefficients and the
by a set of weights $b_k$ applied to the inputs $x_n$ through a convolution to generate the	85	85	data being processed.
outputs $y_n$	86	86
$$y_n=\sum_{k=0}^N b_k x_{n-k}$$	87	87	\section{Finite impulse response filter}
	88	88
As opposed to an implementation on a general purpose processor in which word size is defined by the	89	89	We select FIR filters for their unconditional stability and ease of design. A FIR filter is defined
processor architecture, implementing such a filter on an FPGA offers more degrees of freedom since	90	90	by a set of weights $b_k$ applied to the inputs $x_n$ through a convolution to generate the
not only the coefficient values and number of taps must be defined, but also the number of bits	91	91	outputs $y_n$
defining the coefficients and the sample size. For this reason, and because we consider pipeline	92	92	$$y_n=\sum_{k=0}^N b_k x_{n-k}$$
processing (as opposed to First-In, First-Out FIFO memory batch processing) of radiofrequency	93	93
signals, High Level Synthesis (HLS) languages \cite{kasbah2008multigrid} are not considered but	94	94	As opposed to an implementation on a general purpose processor in which word size is defined by the
the problem is tackled at the Very-high-speed-integrated-circuit Hardware Description Language (VHDL).	95	95	processor architecture, implementing such a filter on an FPGA offers more degrees of freedom since
Since latency is not an issue in an open-loop phase noise characterization instrument, the large	96	96	not only the coefficient values and number of taps must be defined, but also the number of bits
number of taps in the FIR, as opposed to the shorter Infinite Impulse Response (IIR) filter,	97	97	defining the coefficients and the sample size. For this reason, and because we consider pipeline
is not considered an issue, as it would be in a closed loop system.	98	98	processing (as opposed to First-In, First-Out FIFO memory batch processing) of radiofrequency
	99	99	signals, High Level Synthesis (HLS) languages \cite{kasbah2008multigrid} are not considered but
The coefficients are classically expressed as floating point values. However, this binary	100	100	the problem is tackled at the Very-high-speed-integrated-circuit Hardware Description Language (VHDL) level.
number representation is not efficient for fast arithmetic computation by an FPGA. Instead,	101	101	Since latency is not an issue in an open-loop phase noise characterization instrument, the large
we select to quantify these floating point values into integer values. This quantization	102	102	number of taps in the FIR, as opposed to the shorter Infinite Impulse Response (IIR) filter,
will result in some precision loss.	103	103	is not considered an issue, as it would be in a closed loop system.
	104	104
%As illustrated in Fig. \ref{float_vs_int}, we see that we aren't	105	105	The coefficients are classically expressed as floating point values. However, this binary
%need too coefficients or too sample size. If we have lot of coefficients but a small sample size,	106	106	number representation is not efficient for fast arithmetic computation by an FPGA. Instead,
%the first and last are equal to zero. But if we have too sample size for few coefficients that not improve the quality.	107	107	we select to quantify these floating point values into integer values. This quantization
	108	108	will result in some precision loss.
% JMF I do not understand the last sentence above nor the figure below	109	109
% AH basically I meant that taking too few bits with too many coefficients leads to your figure (much better made than mine)	110	110	%As illustrated in Fig. \ref{float_vs_int}, we see that we aren't
% and that conversely, with too many bits on too few coefficients, nothing is gained; I will try to rephrase it	111	111	%need too coefficients or too sample size. If we have lot of coefficients but a small sample size,
	112	112	%the first and last are equal to zero. But if we have too sample size for few coefficients that not improve the quality.
%\begin{figure}[h!tb]	113	113
%\includegraphics[width=\linewidth]{images/float-vs-integer.pdf}	114	114	% JMF I do not understand the last sentence above nor the figure below
%\caption{Impact of the quantization resolution of the coefficients}	115	115	% AH basically I meant that taking too few bits with too many coefficients leads to your figure (much better made than mine)
%\label{float_vs_int}	116	116	% and that conversely, with too many bits on too few coefficients, nothing is gained; I will try to rephrase it
%\end{figure}	117	117
	118	118	%\begin{figure}[h!tb]
\begin{figure}[h!tb]	119	119	%\includegraphics[width=\linewidth]{images/float-vs-integer.pdf}
\includegraphics[width=\linewidth]{images/demo_filtre}	120	120	%\caption{Impact of the quantization resolution of the coefficients}
\caption{Impact of the quantization resolution of the coefficients: the quantization is	121	121	%\label{float_vs_int}
set to 6~bits -- with the horizontal black lines indicating $\pm$1 least significant bit -- setting	122	122	%\end{figure}
the 30~first and 30~last coefficients out of the initial 128~band-pass	123	123
filter coefficients to 0 (red dots).}	124	124	\begin{figure}[h!tb]
\label{float_vs_int}	125	125	\includegraphics[width=\linewidth]{images/demo_filtre}
\end{figure}	126	126	\caption{Impact of the quantization resolution of the coefficients: the quantization is
	127	127	set to 6~bits -- with the horizontal black lines indicating $\pm$1 least significant bit -- setting
The tradeoff between quantization resolution and number of coefficients when considering	128	128	the 30~first and 30~last coefficients out of the initial 128~band-pass
integer operations is not trivial. As an illustration of the issue related to the	129	129	filter coefficients to 0 (red dots).}
relation between number of filter taps and quantization, Fig. \ref{float_vs_int} exhibits	130	130	\label{float_vs_int}
a 128-coefficient FIR bandpass filter designed using floating point numbers (blue). Upon	131	131	\end{figure}
quantization on 6~bit integers, 60 of the 128~coefficients in the beginning and end of the	132	132
taps become null, making the large number of coefficients irrelevant and allowing to save	133	133	The tradeoff between quantization resolution and number of coefficients when considering
processing resource by shrinking the filter length. This tradeoff aimed at minimizing resources	134	134	integer operations is not trivial. As an illustration of the issue related to the
resource, will drive the investigation on cascading filters designed with varying tap resolution	135	135	relation between number of filter taps and quantization, Fig. \ref{float_vs_int} exhibits
resource, will drive the investigation on cascading filters designed with varying tap resolution	136	136	a 128-coefficient FIR bandpass filter designed using floating point numbers (blue). Upon
and tap length, as will be shown in the next section. Indeed, our development strategy closely	137	137	quantization on 6~bit integers, 60 of the 128~coefficients in the beginning and end of the
follows the skeleton approach \cite{crookes1998environment, crookes2000design, benkrid2002towards}	138	138	taps become null, making the large number of coefficients irrelevant and allowing to save
in which basic blocks are defined and characterized before being assembled \cite{hide}	139	139	processing resource by shrinking the filter length. This tradeoff aimed at minimizing resources
in a complete processing chain. In our case, assembling the filter blocks is a simpler block	140	140	to reach a given rejection level, or maximizing out of band rejection for a given computational
combination process since we assume a single value to be processed and a single value to be	141	141	resource, will drive the investigation on cascading filters designed with varying tap resolution
generated at each clock cycle. The FIR filters will not be considered to decimate in the	142	142	and tap length, as will be shown in the next section. Indeed, our development strategy closely
current implementation: the decimation is assumed to be located after the FIR cascade at the	143	143	follows the skeleton approach \cite{crookes1998environment, crookes2000design, benkrid2002towards}
moment.	144	144	in which basic blocks are defined and characterized before being assembled \cite{hide}
	145	145	in a complete processing chain. In our case, assembling the filter blocks is a simpler block
\section{Filter optimization}	146	146	combination process since we assume a single value to be processed and a single value to be
	147	147	generated at each clock cycle. The FIR filters will not be considered to decimate in the
A basic approach for implementing the FIR filter is to compute the transfer function of	148	148	current implementation: the decimation is assumed to be located after the FIR cascade at the
a monolithic filter: this single filter defines all coefficients with the same resolution	149	149	moment.
(number of bits) and processes data represented with their own resolution. Meeting the	150	150
filter shape requires a large number of coefficients, limited by resources of the FPGA since	151	151	\section{Filter optimization}
this filter must process data stream at the radiofrequency sampling rate after the mixer.	152	152
	153	153	A basic approach for implementing the FIR filter is to compute the transfer function of
An optimization problem \cite{leung2004handbook} aims at improving one or many	154	154	a monolithic filter: this single filter defines all coefficients with the same resolution
performance criteria within a constrained resource environment. Amongst the tools	155	155	(number of bits) and processes data represented with their own resolution. Meeting the
developed to meet this aim, Mixed-Integer Linear Programming (MILP) provides the framework to	156	156	filter shape requires a large number of coefficients, limited by resources of the FPGA since
formally define the stated problem and search for an optimal use of available	157	157	this filter must process data stream at the radiofrequency sampling rate after the mixer.
resources \cite{yu2007design, kodek1980design}.	158	158
	159	159	An optimization problem \cite{leung2004handbook} aims at improving one or many
First we need to ensure that our problem is a real optimization problem. When	160	160	performance criteria within a constrained resource environment. Amongst the tools
designing a processing function in the FPGA, we aim at meeting some requirement such as	161	161	developed to meet this aim, Mixed-Integer Linear Programming (MILP) provides the framework to
the throughput, the computation time or the noise rejection. However, due to limited	162	162	formally define the stated problem and search for an optimal use of available
resources to design the process like BRAM (high performance RAM), DSP (Digital Signal Processor)	163	163	resources \cite{yu2007design, kodek1980design}.
or LUT (Look Up Table), a tradeoff must be generally searched between performance and available	164	164
computational resources: optimizing some criteria within finite, limited	165	165	First we need to ensure that our problem is a real optimization problem. When
resources indeed matches the definition of a classical optimization problem.	166	166	designing a processing function in the FPGA, we aim at meeting some requirement such as
	167	167	the throughput, the computation time or the noise rejection. However, due to limited
Specifically the degrees of freedom when addressing the problem of replacing the single monolithic	168	168	resources to design the process like BRAM (high performance RAM), DSP (Digital Signal Processor)
FIR with a cascade of optimized filters are the number of coefficients $N_i$ of each filter $i$ and	169	169	or LUT (Look Up Table), a tradeoff must be generally searched between performance and available
the number of bits $C_i$ representing the coefficients. Because each FIR in the chain is fed the output of the previous stage,	170	170	computational resources: optimizing some criteria within finite, limited
the optimization of the complete processing chain within a constrained resource environment is not	171	171	resources indeed matches the definition of a classical optimization problem.
trivial. The resource occupation of a FIR filter is considered as $C_i \times N_i$ which is	172	172
the number of bits needed in a worst case condition to represent the output of the FIR. Such an	173	173	Specifically the degrees of freedom when addressing the problem of replacing the single monolithic
occupied area estimate assumes that the number of gates scales as the number of bits and the number	174	174	FIR with a cascade of optimized filters are the number of coefficients $N_i$ of each filter $i$ and
of coefficients, but does not account for the detailed implementation of the hardware. Indeed,	175	175	the number of bits $C_i$ representing the coefficients. Because each FIR in the chain is fed the output of the previous stage,
various FPGA implementations will provide different hardware functionalities, and we shall consider	176
at the end of the design a synthesis step using vendor software to assess the validity of the solution	177	176	the optimization of the complete processing chain within a constrained resource environment is not
found. As an example of the limitation linked to the lack of detailed hardware consideration, Block Random	178	177	trivial. The resource occupation of a FIR filter is considered as $C_i \times N_i$ which is
Access Memory (BRAM) used to store filter coefficients are not shared amongst filters, and multiplications	179	178	the number of bits needed in a worst case condition to represent the output of the FIR. Such an
are most efficiently implemented by using DSP blocks whose input word	180	179	occupied area estimate assumes that the number of gates scales as the number of bits and the number
size is finite. DSPs are a scarce resource to be saved in a practical implementation. Keeping a high	181	180	of coefficients, but does not account for the detailed implementation of the hardware. Indeed,
abstraction on the resource occupation is nevertheless selected in the following discussion in order	182	181	various FPGA implementations will provide different hardware functionalities, and we shall consider
to leave enough degrees of freedom in the problem to try and find original solutions: too many	183	182	at the end of the design a synthesis step using vendor software to assess the validity of the solution
constraints in the initial statement of the problem leave little room for finding an optimal solution.	184	183	found. As an example of the limitation linked to the lack of detailed hardware consideration, Block Random
	185	184	Access Memory (BRAM) used to store filter coefficients are not shared amongst filters, and multiplications
\begin{figure}[h!tb]	186	185	are most efficiently implemented by using DSP blocks whose input word
\begin{center}	187	186	size is finite. DSPs are a scarce resource to be saved in a practical implementation. Keeping a high
\includegraphics[width=.5\linewidth]{schema2}	188	187	abstraction on the resource occupation is nevertheless selected in the following discussion in order
\caption{Shape of the filter transmitted power $P$ as a function of frequency:	189	188	to leave enough degrees of freedom in the problem to try and find original solutions: too many
the bandpass BP is considered to occupy the initial	190	189	constraints in the initial statement of the problem leave little room for finding an optimal solution.
40\% of the Nyquist frequency range, the stopband the last 40\%, allowing 20\% transition	191	190
width.}	192	191	\begin{figure}[h!tb]
\label{rejection-shape}	193	192	\begin{center}
\end{center}	194	193	\includegraphics[width=.5\linewidth]{schema2}
\end{figure}	195	194	\caption{Shape of the filter transmitted power $P$ as a function of frequency:
	196	195	the bandpass BP is considered to occupy the initial
Following these considerations, the model is expressed as:	197	196	40\% of the Nyquist frequency range, the stopband the last 40\%, allowing 20\% transition
\begin{align}	198	197	width.}
\begin{cases}	199	198	\label{rejection-shape}
\mathcal{R}_i &= \mathcal{F}(N_i, C_i)\\	200	199	\end{center}
\mathcal{A}_i &= N_i * C_i\\	201	200	\end{figure}
\Delta_i &= \Delta _{i-1} + \mathcal{P}_i	202	201
\end{cases}	203	202	Following these considerations, the model is expressed as:
\label{model-FIR}	204	203	\begin{align}
\end{align}	205	204	\begin{cases}
To explain the system \ref{model-FIR}, $\mathcal{R}_i$ represents the rejection of depending on $N_i$ and $C_i$, $\mathcal{A}$	206	205	\mathcal{R}_i &= \mathcal{F}(N_i, C_i)\\
is a theoretical area occupation of the processing block on the FPGA, and $\Delta_i$ is the total rejection for the current stage $i$.	207	206	\mathcal{A}_i &= N_i * C_i\\
Since the function $\mathcal{F}$ cannot be explicitly expressed, we run simulations to determine the rejection depending	208	207	\Delta_i &= \Delta _{i-1} + \mathcal{P}_i
on $N_i$ and $C_i$. However, selecting the right filter requires a clear definition of the rejection criterion. Selecting an	209	208	\end{cases}
incorrect criterion will lead the linear program solver to produce a solution which might not meet the user requirements.	210	209	\label{model-FIR}
Hence, amongst various criteria including the mean or median value of the FIR response in the stopband as will	211	210	\end{align}
be illustrated later (section \ref{median}), we have designed	212	211	To explain the system \ref{model-FIR}, $\mathcal{R}_i$ represents the stopband rejection dependence with $N_i$ and $C_i$, $\mathcal{A}$
a criterion aimed at avoiding ripples in the passband and considering the maximum of the FIR spectral response in the stopband	213	212	is a theoretical area occupation of the processing block on the FPGA as discussed earlier, and $\Delta_i$ is the total rejection for the current stage $i$.
(Fig. \ref{rejection-shape}). The bandpass criterion is defined as the sum of the absolute values of the spectral response	214	213	Since the function $\mathcal{F}$ cannot be explicitly expressed, we run simulations to determine the rejection depending
in the bandpass, reminiscent of a standard deviation of the spectral response: this criterion must be minimized to avoid	215	214	on $N_i$ and $C_i$. However, selecting the right filter requires a clear definition of the rejection criterion. Selecting an
ripples in the passband. The stopband transfer function maximum must also be minimized in order to improve the filter	216	215	incorrect criterion will lead the linear program solver to produce a solution which might not meet the user requirements.
rejection capability. Weighing these two criteria allows designing the linear program to be solved.	217	216	Hence, amongst various criteria including the mean or median value of the FIR response in the stopband as will
	218	217	be illustrated later (section \ref{median}), we have designed
\begin{figure}[h!tb]	219	218	a criterion aimed at avoiding ripples in the passband and considering the maximum of the FIR spectral response in the stopband
\includegraphics[width=\linewidth]{images/noise-rejection.pdf}	220	219	(Fig. \ref{rejection-shape}). The bandpass criterion is defined as the sum of the absolute values of the spectral response
\caption{Rejection as a function of number of coefficients and number of bits}	221	220	in the bandpass, reminiscent of a standard deviation of the spectral response: this criterion must be minimized to avoid
\label{noise-rejection}	222	221	ripples in the passband. The stopband transfer function maximum must also be minimized in order to improve the filter
\end{figure}	223	222	rejection capability. Weighing these two criteria allows designing the linear program to be solved.
	224	223
The objective function maximizes the noise rejection ($\max(\Delta_{i_{\max}})$) while keeping resource occupation below	225	224	\begin{figure}[h!tb]
a user-defined threshold. The MILP solver is allowed to choose the number of successive	226	225	\includegraphics[width=\linewidth]{images/noise-rejection.pdf}
filters, within an upper bound. The last problem is to model the noise rejection. Since filter	227	226	\caption{Rejection as a function of number of coefficients and number of bits}
noise rejection capability is not modeled with linear equations, a look-up-table is generated	228	227	\label{noise-rejection}
for multiple filter configurations in which the $C_i$, $D_i$ and $N_i$ parameters are varied: for each	229	228	\end{figure}
one of these conditions, the low-pass filter rejection defined as the mean power between	230	229
half the Nyquist frequency and the Nyquist frequency is stored as computed by the frequency response	231	230	The objective function maximizes the noise rejection ($\max(\Delta_{i_{\max}})$) while keeping resource occupation below
of the digital filter (Fig. \ref{noise-rejection}). An intuitive analysis of this chart hints at an optimum	232	231	a user-defined threshold, or aims at minimizing the area needed to reach a given rejection ($\min(S_q)$ in
set of tap length and number of bit for representing the coefficients along the line of the pyramidal	233	232	the forthcoming discussion, Eqs. \ref{cstr_size} and \ref{cstr_rejection}).
shaped rejection capability function.	234	233	The MILP solver is allowed to choose the number of successive
	235	234	filters, within an upper bound. The last problem is to model the noise rejection. Since filter
Linear program formalism for solving the problem is well documented: an objective function is	236	235	noise rejection capability is not modeled with linear equations, a look-up-table is generated
defined which is linearly dependent on the parameters to be optimized. Constraints are expressed	237	236	for multiple filter configurations in which the $C_i$, $D_i$ and $N_i$ parameters are varied: for each
as linear equations and solved using one of the available solvers, in our case GLPK\cite{glpk}.	238	237	one of these conditions, the low-pass filter rejection is stored as computed by the frequency response
With the notation explained in system \ref{model-FIR}, the linear problem is defined as follows:	239	238	of the digital filter (Fig. \ref{noise-rejection}). Various rejection criteria have been investigated,
\paragraph{Variables}	240	239	including mean value of the stopband response, median value of the stopband response, or as finally
\begin{align*}	241	240	selected, maximum value in the stopband. An intuitive analysis of the chart of Fig. \ref{noise-rejection}
x_{i,j} \in \lbrace 0,1 \rbrace & \text{ $i$ is a given filter} \\	242	241	hints at an optimum
& \text{ $j$ is the stage} \\	243	242	set of tap length and number of bits for representing the coefficients along the line of the pyramidal
& \text{ If $x_{i,j}$ is equal to 1, the filter is selected} \\	244	243	shaped rejection capability function.
\end{align*}	245	244
\paragraph{Constants}	246	245	Linear program formalism for solving the problem is well documented: an objective function is
\begin{align*}	247	246	defined which is linearly dependent on the parameters to be optimized. Constraints are expressed
\mathcal{F} = \lbrace F_1, \ldots, F_p \rbrace & \text{ All possible filters}\\	248	247	as linear equations and solved using one of the available solvers, in our case GLPK\cite{glpk}.
& \text{ $p$ is the number of different filters} \\	249	248	With the notation explained in system \ref{model-FIR}, the linear problem is defined as follows:
% N(i) & \text{ % Constant to let the	250	249	\paragraph{Variables}
% number of coefficients %} \\ & \text{	251	250	\begin{align*}
% for filter $i$}\\	252	251	x_{i,j} \in \lbrace 0,1 \rbrace & \text{ $i$ is a given filter} \\
% C(i) & \text{ % Constant to let the	253	252	& \text{ $j$ is the stage} \\
% number of bits of %}\\ & \text{	254	253	& \text{ If $x_{i,j}$ is equal to 1, the filter is selected} \\
% each coefficient for filter $i$}\\	255	254	\end{align*}
\mathcal{S}_{\max} & \text{ Total space available inside the FPGA}	256	255	\paragraph{Constants}
\end{align*}	257	256	\begin{align*}
\paragraph{Constraints}	258	257	\mathcal{F} = \lbrace F_1, \ldots, F_p \rbrace & \text{ All possible filters}\\
\begin{align}	259	258	& \text{ $p$ is the number of different filters} \\
1 \leq i \leq p & \nonumber\\	260	259	% N(i) & \text{ % Constant to let the
1 \leq j \leq q & \text{ $q$ is the maximum number of filter stages} \nonumber \\	261	260	% number of coefficients %} \\ & \text{
\forall j, \mathlarger{\sum_{i}} x_{i,j} = 1 & \text{ At most one filter per stage} \nonumber\\	262	261	% for filter $i$}\\
\mathcal{S}_0 = 0 & \text{ initial occupation} \nonumber\\	263	262	% C(i) & \text{ % Constant to let the
\forall j, \mathcal{S}_j = \mathcal{S}_{j-1} + \mathlarger{\sum_i (x_{i,j} \times \mathcal{A}_i)} \label{cstr_size} \\	264	263	% number of bits of %}\\ & \text{
\mathcal{S}_q \leq \mathcal{S}_{\max}\nonumber \\	265	264	% each coefficient for filter $i$}\\
\mathcal{N}_0 = 0 & \text{ initial rejection}\nonumber\\	266	265	\mathcal{S}_{\max} & \text{ Total space available inside the FPGA}
\forall j, \mathcal{N}_j = \mathcal{N}_{j-1} + \mathlarger{\sum_i (x_{i,j} \times \mathcal{R}_i)} \label{cstr_rejection} \\	267	266	\end{align*}
\mathcal{N}_q \geqslant 160 & \text{ a user-defined bound}\nonumber\\	268	267	\paragraph{Constraints}
& \text{ (e.g. 160~dB here)}\nonumber\\\nonumber	269	268	\begin{align}
\end{align}	270	269	1 \leq i \leq p & \nonumber\\
\paragraph{Goal}	271	270	1 \leq j \leq q & \text{ $q$ is the maximum number of filter stages} \nonumber \\
\begin{align*}	272	271	\forall j, \mathlarger{\sum_{i}} x_{i,j} = 1 & \text{ At most one filter per stage} \nonumber\\
\min \mathcal{S}_q	273	272	\mathcal{S}_0 = 0 & \text{ initial occupation} \nonumber\\
\end{align*}	274	273	\forall j, \mathcal{S}_j = \mathcal{S}_{j-1} + \mathlarger{\sum_i (x_{i,j} \times \mathcal{A}_i)} \label{cstr_size} \\
	275	274	\mathcal{S}_q \leq \mathcal{S}_{\max}\nonumber \\
Constraint \ref{cstr_size} states that the occupation after stage $j$ is the occupation after the	276	275	\mathcal{N}_0 = 0 & \text{ initial rejection}\nonumber\\
previous stage plus the occupation of the filter selected for stage $j$ (it is possible	277	276	\forall j, \mathcal{N}_j = \mathcal{N}_{j-1} + \mathlarger{\sum_i (x_{i,j} \times \mathcal{R}_i)} \label{cstr_rejection} \\
that no filter is selected for this stage). Constraint \ref{cstr_rejection}	278	277	\mathcal{N}_q \geqslant 160 & \text{ a user-defined bound}\nonumber\\
expresses the same recurrence for the rejection: the rejection after stage $j$ is the rejection after the previous stage	279	278	& \text{ (e.g. 160~dB here)}\nonumber\\\nonumber
plus the rejection of the selected filter.	280	279	\end{align}
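
The two recurrences can be unrolled to make the linearity of the model explicit. The following restatement is only a sketch: it assumes that the rejections $\mathcal{R}_i$ are expressed in dB, so that the rejections of cascaded stages add up, and it introduces no quantity beyond those defined above.
\begin{align*}
\mathcal{S}_q &= \sum_{j=1}^{q} \sum_{i=1}^{p} x_{i,j}\, \mathcal{A}_i, &
\mathcal{N}_q &= \sum_{j=1}^{q} \sum_{i=1}^{p} x_{i,j}\, \mathcal{R}_i
\end{align*}
As a purely hypothetical numerical example (the values are chosen for illustration and are not measurements), a two-stage cascade whose selected filters have $\mathcal{A} = 100$ and $250$ area units and $\mathcal{R} = 40$ and $60$~dB yields $\mathcal{S}_2 = 350$ and $\mathcal{N}_2 = 100$~dB.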
	281	280	\paragraph{Goal}
\subsection{Low bandpass ripple and maximum rejection criteria}	282	281	\begin{align*}
	283	282	\min \mathcal{S}_q
The MILP solver provides a solution to the problem by selecting a series of small FIR with	284	283	\end{align*}
increasing number of bits representing data and coefficients as well as an increasing number	285	284
of coefficients, instead of a single monolithic filter.	286	285	Constraint \ref{cstr_size} states that the occupation after stage $j$ is the occupation after the
	287	286	previous stage plus the occupation of the filter selected for stage $j$ (it is possible
\begin{figure}[h!tb]	288	287	that no filter is selected for this stage). Constraint \ref{cstr_rejection}
% \includegraphics[width=\linewidth]{images/compare-fir.pdf}	289	288	expresses the same recurrence for the rejection: the rejection after stage $j$ is the rejection after the previous stage
\includegraphics[width=\linewidth]{images/fir-mono-vs-fir-series-noise-fixe-jmf-light.pdf}	290	289	plus the rejection of the selected filter.
\caption{Comparison of the rejection capability between a series of FIR and a monolithic FIR	291	290
with a cutoff frequency set at half the Nyquist frequency.}	292	291	\subsection{Low bandpass ripple and maximum rejection criteria}
\label{compare-fir}	293	292
\end{figure}	294	293	The MILP solver provides a solution to the problem by selecting a series of small FIR with
	295	294	increasing number of bits representing data and coefficients as well as an increasing number
Fig. \ref{compare-fir} exhibits the	296	295	of coefficients, instead of a single monolithic filter.
performance comparison between one solution and a monolithic FIR when selecting a cutoff	297	296
frequency of half the Nyquist frequency: a series of 5 FIR and a series of 10 FIR with the	298	297	\begin{figure}[h!tb]
same space usage are provided as selected by the MILP solver. The FIR cascade provides better	299	298	% \includegraphics[width=\linewidth]{images/compare-fir.pdf}
rejection than the monolithic FIR at the expense of a lower cutoff frequency which remains to	300	299	\includegraphics[width=\linewidth]{images/fir-mono-vs-fir-series-noise-fixe-jmf-light.pdf}
be tuned or compensated for.	301	300	\caption{Comparison of the rejection capability between a series of FIR and a monolithic FIR
	302	301	with a cutoff frequency set at half the Nyquist frequency.}
	303	302	\label{compare-fir}
The resource occupation when synthesizing such FIR on a Xilinx FPGA is summarized in Tab. \ref{t1}.	304	303	\end{figure}
We have considered a set of resources representative of the hardware platform we work on,	305	304	
Avnet's Zedboard featuring a Xilinx XC7Z020-CLG484-1 Zynq System on Chip (SoC). The results in	306	305	Fig. \ref{compare-fir} exhibits the
Tab. \ref{t1} emphasize that implementing the monolithic single FIR is impossible due to	307	306	performance comparison between one solution and a monolithic FIR when selecting a cutoff
the insufficient hardware resources (exhausted LUT resources), while the FIR cascading 5 or 10	308	307	frequency of half the Nyquist frequency: a series of 5 FIR and a series of 10 FIR with the
filters fit in the available resources. However, in all cases the DSP resources are fully	309	308	same space usage are provided as selected by the MILP solver. The FIR cascade provides better
used: while the design can be synthesized using Xilinx proprietary Vivado 2016.2 software,	310	309	rejection than the monolithic FIR at the expense of a lower cutoff frequency which remains to
implementing the design fails due to the excessive resource usage preventing routing the signals	311	310	be tuned or compensated for.
on the FPGA. Such results emphasize on the one hand the improvement prospect of the optimization	312	311	
procedure by finding non-trivial solutions matching resource constraints, but on the other	313	312	
hand also illustrate the limitation of a model with an abstraction layer that does not account	314	313	The resource occupation when synthesizing such FIR on a Xilinx FPGA is summarized in Tab. \ref{t1}.
for the detailed architecture of the hardware.	315	314	We have considered a set of resources representative of the hardware platform we work on,
	316	315	Avnet's Zedboard featuring a Xilinx XC7Z020-CLG484-1 Zynq System on Chip (SoC). The results reported in
\begin{table}[h!tb]	317	316	Tab. \ref{t1} emphasize that implementing the monolithic single FIR is impossible due to
\caption{Resource occupation on a Xilinx Zynq-7000 series FPGA when synthesizing the FIR cascade	318	317	the insufficient hardware resources (exhausted LUT resources), while the FIR cascading 5 or 10
identified as optimal by the MILP solver within a finite resource criterion. The last line refers	319	318	filters fit in the available resources. However, in all cases the DSP resources are fully
to available resources on a Zynq-7020 as found on the Zedboard.}	320	319	used: while the design can be synthesized using Xilinx proprietary Vivado 2016.2 software,
\begin{center}	321	320	implementing the design fails due to the excessive resource usage preventing routing the signals
\begin{tabular}{|c|cccc|}\hline	322	321	procedure by finding non-trivial solutions matching resource constraints, but on the other
FIR & BlockRAM & LookUpTables & DSP & rejection (dB)\\\hline\hline	323	322	procedure by finding non-trivial solutions matching resource constraints, but on the other
1 (monolithic) & 1 & 76183 & 220 & -162 \\	324	323	hand also illustrate the limitation of a model with an abstraction layer that does not account
5 & 5 & 18597 & 220 & -160 \\	325	324	for the detailed architecture of the hardware.
10 & 8 & 24729 & 220 & -161 \\\hline\hline	326	325
\textbf{Zynq 7020} & \textbf{420} & \textbf{53200} & \textbf{220} & \\\hline	327	326	\begin{table}[h!tb]
%\begin{tabular}{\|c\|ccccc\|}\hline	328	327	\caption{Resource occupation on a Xilinx Zynq-7000 series FPGA when synthesizing the FIR cascade
%FIR & BRAM36 & BRAM18 & LUT & DSP & rejection (dB)\\\hline\hline	329	328	identified as optimal by the MILP solver within a finite resource criterion. The last line refers
%1 (monolithic) & 1 & 0 & {\color{Red}76183} & 220 & -162 \\	330	329	to available resources on a Zynq-7020 as found on the Zedboard.}
%5 & 0 & 5 & {\color{Green}18597} & 220 & -160 \\	331	330	\begin{center}
%10 & 0 & 8 & {\color{Green}24729} & 220 & -161 \\\hline\hline	332	331	\begin{tabular}{|c|cccc|}\hline
%\textbf{Zynq 7020} & \textbf{140} & \textbf{280} & \textbf{53200} & \textbf{220} & \\\hline	333	332	FIR & BlockRAM & LookUpTables & DSP & rejection (dB)\\\hline\hline
\end{tabular}	334	333	1 (monolithic) & 1 & 76183 & 220 & -162 \\
\end{center}	335	334	5 & 5 & 18597 & 220 & -160 \\
%\vspace{-0.7cm}	336	335	10 & 8 & 24729 & 220 & -161 \\\hline\hline
\label{t1}	337	336	\textbf{Zynq 7020} & \textbf{420} & \textbf{53200} & \textbf{220} & \\\hline
\end{table}	338	337	%\begin{tabular}{\|c\|ccccc\|}\hline
	339	338	%FIR & BRAM36 & BRAM18 & LUT & DSP & rejection (dB)\\\hline\hline
\subsection{Alternate criteria}\label{median}	340	339	%1 (monolithic) & 1 & 0 & {\color{Red}76183} & 220 & -162 \\
	341	340	%5 & 0 & 5 & {\color{Green}18597} & 220 & -160 \\
Fig. \ref{compare-fir} provides FIR solutions matching well the targeted transfer	342	341	%10 & 0 & 8 & {\color{Green}24729} & 220 & -161 \\\hline\hline
function, namely low ripple in the bandpass defined as the first 40\% of the frequency	343	342	%\textbf{Zynq 7020} & \textbf{140} & \textbf{280} & \textbf{53200} & \textbf{220} & \\\hline
range and maximum rejection of 160~dB in the last 40\% stopband. We illustrate now, for	344	343	\end{tabular}
demonstrating the need to properly select the optimization criterion, two cases of poor	345	344	\end{center}
filter shapes obtained by selecting the mean value and median value of the rejection,	346	345	%\vspace{-0.7cm}
with no consideration for the ripples in the bandpass. The results of the optimizations,	347	346	\label{t1}
in these cases, are shown in Figs. \ref{compare-mean} and \ref{compare-median}.	348	347	\end{table}
	349	348
\begin{figure}[h!tb]	350	349	\subsection{Alternate criteria}\label{median}
\includegraphics[width=\linewidth]{images/fir-mono-vs-fir-series-noise-fixe-mean-light.pdf}	351	350
\caption{Comparison of the rejection capability between a series of FIR and a monolithic FIR	352	351	Fig. \ref{compare-fir} provides FIR solutions matching well the targeted transfer
with a cutoff frequency set at half the Nyquist frequency.}	353	352	function, namely low ripple in the bandpass defined as the first 40\% of the frequency
\label{compare-mean}	354	353	range and maximum rejection of 160~dB in the last 40\% stopband. We illustrate now, for
\end{figure}	355	354	demonstrating the need to properly select the optimization criterion, two cases of poor
	356	355	filter shapes obtained by selecting the mean value and median value of the rejection,
In the case of the mean value criterion (Fig. \ref{compare-mean}), the solution is not	357	356	with no consideration for the ripples in the bandpass. The results of the optimizations,
acceptable since the notch at the end of the transition band compensates for some unacceptable	358	357	in these cases, are shown in Figs. \ref{compare-mean} and \ref{compare-median}.
rise in the rejection close to the Nyquist frequency. Applying such a filter might yield excessive	359	358
high frequency spurious components to be aliased at low frequency when decimating the signal.	360	359	\begin{figure}[h!tb]
Similarly, the lack of criterion on the bandpass shape induces a shape with poor flatness and	361	360	\includegraphics[width=\linewidth]{images/fir-mono-vs-fir-series-noise-fixe-mean-light.pdf}
a slowly decaying transfer function starting to attenuate spectral components well before the	362	361	\caption{Comparison of the rejection capability between a series of FIR and a monolithic FIR
transition band starts. Such issues are partly alleviated by replacing a mean rejection value with	363	362	with a cutoff frequency set at half the Nyquist frequency.}
a median rejection value (Fig. \ref{compare-median}) but solutions remain unacceptable for	364	363	\label{compare-mean}
the reasons stated previously and much poorer than those found with the maximum rejection criterion	365	364	\end{figure}
selected earlier (Fig. \ref{compare-fir}).	366	365
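
The three rejection criteria compared in this section can be written explicitly as follows. This is a sketch in a notation introduced for illustration only: $|H(f_k)|$ is the magnitude of the frequency response of a candidate filter at the $K$ frequency samples $f_k$ of the stopband, and the rejection is counted as a positive attenuation in dB. Computing the mean and the median over the values in dB is an assumption of this sketch.
\begin{align*}
G_k &= 20 \log_{10} |H(f_k)|, \quad 1 \leq k \leq K\\
\mathcal{R}^{\mathrm{max}} &= -\max_{k} G_k\\
\mathcal{R}^{\mathrm{mean}} &= -\frac{1}{K} \sum_{k=1}^{K} G_k\\
\mathcal{R}^{\mathrm{median}} &= -\operatorname{median}_{k}\, G_k
\end{align*}
A deep notch lowers a few $G_k$ and can therefore offset a rise of other samples in $\mathcal{R}^{\mathrm{mean}}$, whereas $\mathcal{R}^{\mathrm{max}}$ is set by the worst stopband sample alone.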
	367	366	In the case of the mean value criterion (Fig. \ref{compare-mean}), the solution is not
\begin{figure}[h!tb]	368	367	acceptable since the notch at the end of the transition band compensates for some unacceptable
\includegraphics[width=\linewidth]{images/fir-mono-vs-fir-series-noise-fixe-median-light.pdf}	369	368	rise in the rejection close to the Nyquist frequency. Applying such a filter might yield excessive
\caption{Comparison of the rejection capability between a series of FIR and a monolithic FIR	370	369	high frequency spurious components to be aliased at low frequency when decimating the signal.
with a cutoff frequency set at half the Nyquist frequency.}	371	370	Similarly, the lack of criterion on the bandpass shape induces a shape with poor flatness and
\label{compare-median}	372	371	a slowly decaying transfer function starting to attenuate spectral components well before the
\end{figure}	373	372	transition band starts. Such issues are partly alleviated by replacing a mean rejection value with
	374	373	a median rejection value (Fig. \ref{compare-median}) but solutions remain unacceptable for
\section{Filter coefficient selection}	375	374	the reasons stated previously and much poorer than those found with the maximum rejection criterion
	376	375	selected earlier (Fig. \ref{compare-fir}).
The coefficients of a single monolithic filter are computed as the impulse response	377	376
of the filter transfer function, and practically approximated by a multitude of methods	378	377	\begin{figure}[h!tb]
including least square optimization (Matlab's {\tt firls} function), Hamming or Kaiser windowing	379	378	\includegraphics[width=\linewidth]{images/fir-mono-vs-fir-series-noise-fixe-median-light.pdf}
(Matlab's {\tt fir1} function).	380	379	\caption{Comparison of the rejection capability between a series of FIR and a monolithic FIR
	381	380	with a cutoff frequency set at half the Nyquist frequency.}
\begin{figure}[h!tb]	382	381	\label{compare-median}
\includegraphics[width=\linewidth]{images/fir1-vs-firls}	383	382	\end{figure}
\caption{Evolution of the rejection capability of least-square optimized filters and Hamming	384	383
FIR filters as a function of the number of coefficients, for floating point numbers and 8-bit	385	384	\section{Filter coefficient selection}
encoded integers.}	386	385
\label{2}	387	386	The coefficients of a single monolithic filter are computed as the impulse response
\end{figure}	388	387	of the filter transfer function, and practically approximated by a multitude of methods
	389	388	including least square optimization (Matlab's {\tt firls} function), Hamming or Kaiser windowing
Cascading filters opens a new optimization opportunity by	390	389	(Matlab's {\tt fir1} function).
selecting various coefficient sets depending on the number of coefficients. Fig. \ref{2}	391	390
illustrates that for a number of coefficients ranging from 8 to 47, {\tt fir1} provides a better	392	391	\begin{figure}[h!tb]
rejection than {\tt firls}: since the linear solver increases the number of coefficients along	393	392	\includegraphics[width=\linewidth]{images/fir1-vs-firls}
the processing chain, the type of selected filter also changes depending on the number of coefficients	394	393	\caption{Evolution of the rejection capability of least-square optimized filters and Hamming
and evolves along the processing chain.	395	394	FIR filters as a function of the number of coefficients, for floating point numbers and 8-bit
	396	395	encoded integers.}
\section{Conclusion}	397	396	\label{2}
	398	397	\end{figure}
We address the optimization problem of designing a low-pass filter chain in a Field Programmable Gate	399	398
Array for improved noise rejection within constrained resource occupation, as needed for	400	399	Cascading filters opens a new optimization opportunity by
real time processing of radiofrequency signals when characterizing spectral phase noise	401	400	selecting various coefficient sets depending on the number of coefficients. Fig. \ref{2}
characteristics of stable oscillators. The flexibility of the digital approach makes the result	402	401	illustrates that for a number of coefficients ranging from 8 to 47, {\tt fir1} provides a better
best suited for closing the loop and using the measurement output in a feedback loop for	403	402	rejection than {\tt firls}: since the linear solver increases the number of coefficients along
controlling clocks, e.g. in a quartz-stabilized high performance clock whose long term behavior	404	403	the processing chain, the type of selected filter also changes depending on the number of coefficients
is controlled by a non-piezoelectric resonator (sapphire resonator, microwave or optical	405	404	and evolves along the processing chain.
atomic transition).	406	405
	407	406	\section{Conclusion}
\section*{Acknowledgement}	408	407
	409	408	We address the optimization problem of designing a low-pass filter chain in a Field Programmable Gate
This work is supported by the ANR Programme d'Investissement d'Avenir in	410	409	Array for improved noise rejection within constrained resource occupation, as needed for
progress at the Time and Frequency Departments of the FEMTO-ST Institute	411	410	real time processing of radiofrequency signals when characterizing spectral phase noise
(Oscillator IMP, First-TF and Refimeve+), and by R\'egion de Franche-Comt\'e.	412	411	characteristics of stable oscillators. The flexibility of the digital approach makes the result
The authors would like to thank E. Rubiola, F. Vernotte, G. Cabodevila for support and	413	412	best suited for closing the loop and using the measurement output in a feedback loop for
fruitful discussions.	414	413	controlling clocks, e.g. in a quartz-stabilized high performance clock whose long term behavior
	415	414	is controlled by a non-piezoelectric resonator (sapphire resonator, microwave or optical
\bibliographystyle{IEEEtran}	416	415	atomic transition).
\balance	417	416
\bibliography{references,biblio}	418	417	\section*{Acknowledgement}
\end{document}	419	418
	420	419	This work is supported by the ANR Programme d'Investissement d'Avenir in
\section{Scheduling context}	421	420	progress at the Time and Frequency Departments of the FEMTO-ST Institute
In this part, we give definitions of terms from the scheduling domain	422	421	(Oscillator IMP, First-TF and Refimeve+), and by R\'egion de Franche-Comt\'e.
and we will see that the subject at hand is very close to a scheduling problem. As a result,	423	422	The authors would like to thank E. Rubiola, F. Vernotte, G. Cabodevila for support and
we will be able to go further than the work reviewed previously, and we will attempt scheduling	424	423	fruitful discussions.
and optimization approaches.	425	424	
	426	425	\bibliographystyle{IEEEtran}
\subsection{Vocabulary definitions}	427	426	\balance
First of all, we must define what an optimization problem is. There are two important	428	427	\bibliography{references,biblio}
definitions to give. The first is proposed by Legrand and Robert in their book \cite{def1-ordo}:	429	428	\end{document}
\begin{definition}	430	429
\label{def-ordo1}	431	430	\section{Scheduling context}
A schedule of a task system $G\ =\ (V,\ E,\ w)$ is a function $\sigma$:	432	431	In this part, we give definitions of terms from the scheduling domain
$V \rightarrow \mathbb{N}$ such that $\sigma(u) + w(u) \leq \sigma(v)$ for every edge $(u,\ v) \in E$.	433	432	and we will see that the subject at hand is very close to a scheduling problem. As a result,
\end{definition}	434	433	we will be able to go further than the work reviewed previously, and we will attempt scheduling
	435	434	and optimization approaches.
Put more simply, the set $V$ represents the tasks to execute, the set $E$ represents the dependencies	436	435	
between tasks, and $w$ gives the execution time of each task. The function $\sigma$ therefore gives the start time of	437	436	\subsection{Vocabulary definitions}
each task. The definition states that if a task $v$ depends on a task $u$, then	438	437	First of all, we must define what an optimization problem is. There are two important
the start date of $v$ is greater than or equal to the start of the execution of task $u$ plus its	439	438	definitions to give. The first is proposed by Legrand and Robert in their book \cite{def1-ordo}:
execution time.	440	439	\begin{definition}
	441	440	\label{def-ordo1}
Another important definition, proposed by Leung et al. \cite{def2-ordo}, is:	442	441	A schedule of a task system $G\ =\ (V,\ E,\ w)$ is a function $\sigma$:
\begin{definition}	443	442	$V \rightarrow \mathbb{N}$ such that $\sigma(u) + w(u) \leq \sigma(v)$ for every edge $(u,\ v) \in E$.
\label{def-ordo2}	444	443	\end{definition}
Scheduling deals with the allocation of scarce resources to activities with	445	444	
the objective of optimizing one or more performance criteria.	446	445	Put more simply, the set $V$ represents the tasks to execute, the set $E$ represents the dependencies
\end{definition}	447	446	between tasks, and $w$ gives the execution time of each task. The function $\sigma$ therefore gives the start time of
	448	447	each task. The definition states that if a task $v$ depends on a task $u$, then
This definition is more generic, but it is of greater interest to us than Definition \ref{def-ordo1}.	449	448	the start date of $v$ is greater than or equal to the start of the execution of task $u$ plus its
Indeed, the part of that first definition that matters to us is compliance with task precedence.	450	449	execution time.
In practice, the start dates are not really of interest to us.	451	450	
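
As a minimal illustration of Definition \ref{def-ordo1}, consider a hypothetical chain of three tasks (the names and durations are chosen for this example only) with $V = \{a, b, c\}$, $E = \{(a,b), (b,c)\}$, $w(a) = 2$, $w(b) = 3$ and $w(c) = 1$. The assignment
\[
\sigma(a) = 0, \quad \sigma(b) = 2, \quad \sigma(c) = 5
\]
is a valid schedule since $\sigma(a) + w(a) = 2 \leq \sigma(b)$ and $\sigma(b) + w(b) = 5 \leq \sigma(c)$. Choosing $\sigma(b) = 1$ instead would violate the constraint on the edge $(a,b)$, because $0 + 2 > 1$.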
	452	451	Another important definition, proposed by Leung et al. \cite{def2-ordo}, is:
In contrast, Definition \ref{def-ordo2} will be at the heart of the project. To be convinced of this,	453	452	\begin{definition}
we must first define which type of scheduling problem we are dealing with and which	454	453	\label{def-ordo2}
methods can be applied.	455	454	Scheduling deals with the allocation of scarce resources to activities with
	456	455	the objective of optimizing one or more performance criteria.
Scheduling problems can be classified into different categories:	457	456	\end{definition}
\begin{itemize}	458	457	
\item Independent tasks: in this category of problems, the tasks are completely independent	459	458	This definition is more generic, but it is of greater interest to us than Definition \ref{def-ordo1}.
of one another. In our case, this is not the best fit.	460	459	Indeed, the part of that first definition that matters to us is compliance with task precedence.
\item Task graph: Definition \ref{def-ordo1} describes this category. Most of the time,	461	460	In practice, the start dates are not really of interest to us.
the tasks are represented by a DAG. This category is very close to our case since we must also execute	462	461	
tasks that have a number of dependencies. One could even say that in some cases	463	462	In contrast, Definition \ref{def-ordo2} will be at the heart of the project. To be convinced of this,
we have anti-trees, that is, a multitude of input tasks that converge towards a	464	463	we must first define which type of scheduling problem we are dealing with and which
final task.	465	464	methods can be applied.
\item Workflow: this category is a subcategory of task graphs in the sense that	466	465	
it is a task graph repeated many times. This is exactly the type of problem	467	466	Scheduling problems can be classified into different categories:
that we address here.	468	467	\begin{itemize}
\end{itemize}	469	468	\item Independent tasks: in this category of problems, the tasks are completely independent
	470	469	of one another. In our case, this is not the best fit.
Of course, this list is not exhaustive, and many other classifications and sub-classifications	471	470	\item Task graph: Definition \ref{def-ordo1} describes this category. Most of the time,
of these problems exist. We have only mentioned the most common categories here.	472	471	the tasks are represented by a DAG. This category is very close to our case since we must also execute
	473	472	tasks that have a number of dependencies. One could even say that in some cases
Another point to define is the optimization criterion. Here again, there is a large number of	474	473	we have anti-trees, that is, a multitude of input tasks that converge towards a
possible criteria. We therefore discuss the main ones:	475	474	final task.
\begin{itemize}	476	475	\item Workflow: this category is a subcategory of task graphs in the sense that
\item Total completion time (makespan): this criterion is one of the most common optimization	477	476	it is a task graph repeated many times. This is exactly the type of problem
criteria. The aim is to minimize the end date of the last task of the set of	478	477	that we address here.
tasks to execute. The challenge of this optimization is therefore to find the optimal schedule allowing	479	478	\end{itemize}
execution to finish as early as possible.	480	479	
\item Sum of execution times (flowtime): this consists of summing the execution times of all the tasks	481	480	Of course, this list is not exhaustive, and many other classifications and sub-classifications
and optimizing this result.	482	481	of these problems exist. We have only mentioned the most common categories here.
\item Throughput: this criterion aims to maximize the data processing throughput.	483	482	
\end{itemize}	484	483	Another point to define is the optimization criterion. Here again, there is a large number of
	485	484	possible criteria. We therefore discuss the main ones:
In addition, several optimization criteria may be needed. This is then a multi-criteria	486	485	\begin{itemize}
optimization. Of course, this makes the problem all the more complex, since the best solution for one	487	486	\item Total completion time (makespan): this criterion is one of the most common optimization
criterion can be very poor for another. In this case, the goal is to find a solution that	488	487	criteria. The aim is to minimize the end date of the last task of the set of
offers the best trade-off between all the criteria.	489	488	tasks to execute. The challenge of this optimization is therefore to find the optimal schedule allowing
	490	489	execution to finish as early as possible.
\subsection{Formalisation du problème}	491	490	\item Somme des temps d'ex\'ecution (Flowtime en anglais) : il s'agit de faire la somme des temps d'ex\'ecution de toutes les t\^aches
\label{formalisation}	492	491	et d'optimiser ce r\'esultat.
Maintenant que nous avons donn\'e le vocabulaire li\'e à l'ordonnancement, nous allons pouvoir essayer caract\'eriser	493	492	\item Le d\'ebit : ce critère quant à lui, vise à augmenter au maximum le d\'ebit de traitement des donn\'ees.
formellement notre problème. En effet, nous allons reprendre les contraintes \'enonc\'ees dans la sections \ref{def-contraintes}	494	493	\end{itemize}
et nous essayerons de les formaliser le plus finement possible.	495	494
	496	495	En plus de cela, on peut avoir besoin de plusieurs critères d'optimisation. Il s'agit dans ce cas d'une optimisation
Comme nous l'avons dit, une t\^ache est un bloc de traitement. Chaque t\^ache $i$ dispose d'un ensemble de paramètres	497	496	multi-critères. Bien entendu, cela complexifie d'autant plus le problème car la solution la plus optimale pour un
que nous nommerons $\mathcal{P}_{i}$. Cet ensemble $\mathcal{P}_i$ est propre à chaque t\^ache et il variera d'une	498	497	des critères peut être très mauvaise pour un autre critère. De ce cas, il s'agira de trouver une solution qui permet
t\^ache à l'autre. Nous reviendrons plus tard sur les paramètres qui peuvent composer cet ensemble.	499	498	de faire le meilleur compromis entre tous les critères.
	500	499
Besides this set $\mathcal{P}_i$, every task has the following common parameters:	501	500	\subsection{Problem formalization}
\begin{itemize}	502	501	\label{formalisation}
\item Task duration: as stated earlier, on an FPGA time is counted in clock cycles.	503	502	Now that the scheduling vocabulary has been introduced, we can attempt to characterize
Moreover, the blocks are always busy, and some can even read an input and return a result on every clock cycle.	504	503	our problem formally. To do so, we revisit the constraints stated in Section~\ref{def-contraintes}
The duration of a task therefore cannot be the time elapsed between the input of one datum and the output of another. We define the	505	504	and try to formalize them as precisely as possible.
duration as the processing time of one datum, that is, the difference between the date at which a datum leaves the block	506	505
and the date at which it entered. We denote this duration $\delta_i$. % I should name it w as in def2	507	506	As stated above, a task is a processing block. Each task $i$ has a set of parameters
\item Precision: the precision of a datum is its number of significant bits. Precision may change as the data	508	507	that we denote $\mathcal{P}_{i}$. This set $\mathcal{P}_i$ is specific to each task and varies from one
is processed, so we denote the input precision of task $i$ by $\pi_i^-$ and its output precision by $\pi_i^+$.	509	508	task to another. We return later to the parameters that may belong to this set.
\item Input (or output) stream frequency: the rate at which data arrive (resp. leave).	510	509
These frequencies vary from task to task. Some blocks slow the stream down, which is why we distinguish the input	511	510	Besides this set $\mathcal{P}_i$, every task has the following common parameters:
stream frequency from the output frequency. We denote the input stream frequency $f_i^-$ and the output frequency $f_i^+$.	512	511	\begin{itemize}
\item Input (or output) data quantity: the amount of data that the block expects to process (resp.	513	512	\item Task duration: as stated earlier, on an FPGA time is counted in clock cycles.
is able to produce). A task may have to process large volumes of data and output only part of them. Here	514	513	Moreover, the blocks are always busy, and some can even read an input and return a result on every clock cycle.
again, input and output must be distinguished. We denote the incoming data quantity $q_i^-$	515	514	The duration of a task therefore cannot be the time elapsed between the input of one datum and the output of another. We define the
and the outgoing data quantity $q_i^+$ for a task $i$.	516	515	duration as the processing time of one datum, that is, the difference between the date at which a datum leaves the block
\item Input (or output) throughput: the data rate that the task is able to process or that it	517	516	and the date at which it entered. We denote this duration $\delta_i$. % I should name it w as in def2
delivers at its output. It simply combines the two previous parameters. We define the input throughput of	518	517	\item Precision: the precision of a datum is its number of significant bits. Precision may change as the data
task $i$ as $d_i^-\ =\ q_i^- \times f_i^-$ and its output throughput as $d_i^+\ =\ q_i^+ \times f_i^+$.	519	518	is processed, so we denote the input precision of task $i$ by $\pi_i^-$ and its output precision by $\pi_i^+$.
\item Task size: since the area available in an FPGA is limited, this parameter expresses the space that the task occupies within the block.	520	519	\item Input (or output) stream frequency: the rate at which data arrive (resp. leave).
We denote this size $\mathcal{A}_i$.	521	520	These frequencies vary from task to task. Some blocks slow the stream down, which is why we distinguish the input
\item Predecessors and successors of a task: these identify the tasks required before	522	521	stream frequency from the output frequency. We denote the input stream frequency $f_i^-$ and the output frequency $f_i^+$.
task $i$ can be processed, as well as the tasks that depend on it. These sets are denoted $\Gamma _i ^-$ and $ \Gamma _i ^+$. \\	523	522	\item Input (or output) data quantity: the amount of data that the block expects to process (resp.
%TODO Is this really a parameter?	524	523	is able to produce). A task may have to process large volumes of data and output only part of them. Here
\end{itemize}	525	524	again, input and output must be distinguished. We denote the incoming data quantity $q_i^-$
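
As a purely illustrative check of the throughput definition, assume (this is our assumption, the text does not fix the units) that $q_i^-$ is expressed in bytes per sample and $f_i^-$ in samples per second. A stream of 16-bit samples ($q_i^- = 2$ bytes) arriving at a hypothetical rate of $f_i^- = 100 \times 10^{6}$ samples per second then gives
\begin{align*}
d_i^- = q_i^- \times f_i^- = 2 \times 100 \times 10^{6} = 200 \times 10^{6} \text{ bytes/s},
\end{align*}
that is, 200~MB/s.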
	526	525	and the outgoing data quantity $q_i^+$ for a task $i$.
These common parameters are strongly tied to the elements of $\mathcal{P}_i$. Here are some examples of relations	527	526	\item Input (or output) throughput: the data rate that the task is able to process or that it
that we have identified:	528	527	delivers at its output. It simply combines the two previous parameters. We define the input throughput of
\begin{itemize}	529	528	task $i$ as $d_i^-\ =\ q_i^- \times f_i^-$ and its output throughput as $d_i^+\ =\ q_i^+ \times f_i^+$.
\item $ \delta _i \ = \ \mathcal{F}_{\delta}(\pi_i^-,\ \pi_i^+,\ d_i^-,\ d_i^+,\ \mathcal{P}_i) $ gives the execution time	530	529	\item Task size: since the area available in an FPGA is limited, this parameter expresses the space that the task occupies within the block.
of the task as a function of the required precision, the throughput and the internal parameters.	531	530	We denote this size $\mathcal{A}_i$.
\item $ \pi _i ^+ \ = \ \mathcal{F}_{p}(\pi_i^-,\ \mathcal{P}_i) $: the function $\mathcal{F}_p$ gives the output precision from the input precision	532	531	\item Predecessors and successors of a task: these identify the tasks required before
and the internal parameters of the task.	533	532	task $i$ can be processed, as well as the tasks that depend on it. These sets are denoted $\Gamma _i ^-$ and $ \Gamma _i ^+$. \\
\item $d_i^+\ =\ \mathcal{F}_d(d_i^-, \mathcal{P}_i)$: the function $\mathcal{F}_d$ gives the output throughput of the task as a function of the input	534	533	%TODO Is this really a parameter?
throughput and the internal variables of the task.	535	534	\end{itemize}
\item $\mathcal{A}_i\ =\ \mathcal{F}_A(\pi_i^-,\ \pi_i^+,\ d_i^-,\ d_i^+, \mathcal{P}_i)$	536	535
\end{itemize}	537	536	These common parameters are strongly tied to the elements of $\mathcal{P}_i$. Here are some examples of relations
For now, we are not able to give a general definition of these functions. However,	538	537	that we have identified:
on a few simple examples (see~\ref{def-contraintes}), we manage to evaluate them.	539	538	\begin{itemize}
	540	539	\item $ \delta _i \ = \ \mathcal{F}_{\delta}(\pi_i^-,\ \pi_i^+,\ d_i^-,\ d_i^+,\ \mathcal{P}_i) $ gives the execution time
Now that all the useful notation has been introduced, we state the constraints of our problem. Given	541	540	of the task as a function of the required precision, the throughput and the internal parameters.
a DAG $G(V,\ E)$, the following constraints hold for every edge $(i, j)\ \in\ E$:	542	541	\item $ \pi _i ^+ \ = \ \mathcal{F}_{p}(\pi_i^-,\ \mathcal{P}_i) $: the function $\mathcal{F}_p$ gives the output precision from the input precision
	543	542	and the internal parameters of the task.
\paragraph{Precision constraint:}	544	543	\item $d_i^+\ =\ \mathcal{F}_d(d_i^-, \mathcal{P}_i)$: the function $\mathcal{F}_d$ gives the output throughput of the task as a function of the input
This inequality expresses the precision constraint from one task to the next:	545	544	throughput and the internal variables of the task.
\begin{align*}	546	545	\item $\mathcal{A}_i\ =\ \mathcal{F}_A(\pi_i^-,\ \pi_i^+,\ d_i^-,\ d_i^+, \mathcal{P}_i)$
\pi _i ^+ \geq \pi _j ^-	547	546	\end{itemize}
\end{align*}	548	547	For now, we are not able to give a general definition of these functions. However,
	549	548	on a few simple examples (see~\ref{def-contraintes}), we manage to evaluate them.
\paragraph{Throughput constraint:}	550	549
This relation expresses the throughput constraint from one task to the next:	551	550	Now that all the useful notation has been introduced, we state the constraints of our problem. Given
\begin{align*}	552	551	a DAG $G(V,\ E)$, the following constraints hold for every edge $(i, j)\ \in\ E$:
d _i ^+ = q _j ^- \times (f_i + (1 / s_j) ) & \text{ where } s_j \text{ is a positive delay value of the task}	553	552
\end{align*}	554	553	\paragraph{Precision constraint:}
	555	554	This inequality expresses the precision constraint from one task to the next:
\paragraph{Synchronization constraint:}	556	555	\begin{align*}
This constraint requires that if, at some point in the processing, the DAG splits into several parallel branches	557	556	\pi _i ^+ \geq \pi _j ^-
that later merge, the sum of the latencies along each branch must be the same.	558	557	\end{align*}
More formally, if several disjoint paths lead from task $s$ to task $f$, then:	559	558
\begin{align*}	560	559	\paragraph{Throughput constraint:}
\forall \text{ path } \mathcal{C}_1(s, \dots, f),	561	560	This relation expresses the throughput constraint from one task to the next:
\forall \text{ path } \mathcal{C}_2(s, \dots, f)	562	561	\begin{align*}
\text{ such that } \mathcal{C}_1 \neq \mathcal{C}_2	563	562	d _i ^+ = q _j ^- \times (f_i + (1 / s_j) ) & \text{ where } s_j \text{ is a positive delay value of the task}
\Rightarrow	564	563	\end{align*}
\sum _{i \in \mathcal{C}_1} \delta_i = \sum _{i \in \mathcal{C}_2} \delta_i	565	564
\end{align*}	566	565	\paragraph{Synchronization constraint:}
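
As an illustrative sketch with hypothetical values, suppose two branches leave $s$ and merge at $f$: the first goes through tasks $a$ and $b$ with $\delta_a = 3$ and $\delta_b = 5$ clock cycles, the second through a single task $c$. Since $\delta_s$ and $\delta_f$ appear in both sums, the constraint reduces to
\begin{align*}
\delta_a + \delta_b = \delta_c \quad \Longrightarrow \quad \delta_c = 3 + 5 = 8 \text{ clock cycles}.
\end{align*}
A shorter second branch, for example $\delta_c = 6$, would violate the constraint by 2 cycles.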
	567	566	This constraint requires that if, at some point in the processing, the DAG splits into several parallel branches
\paragraph{Area constraint:}	568	567	that later merge, the sum of the latencies along each branch must be the same.
This inequality expresses the area constraint of the FPGA. The maximum size of the FPGA chip is denoted $\mathcal{A}_{FPGA}$:	569	568	More formally, if several disjoint paths lead from task $s$ to task $f$, then:
\begin{align*}	570	569	\begin{align*}
\sum _{i \in V} \mathcal{A}_i \leq \mathcal{A}_{FPGA}	571	570	\forall \text{ path } \mathcal{C}_1(s, \dots, f),
\end{align*}	572	571	\forall \text{ path } \mathcal{C}_2(s, \dots, f)
	573	572	\text{ such that } \mathcal{C}_1 \neq \mathcal{C}_2
\subsection{Modeling examples}	574	573	\Rightarrow
\label{exemples-modeles}	575	574	\sum _{i \in \mathcal{C}_1} \delta_i = \sum _{i \in \mathcal{C}_2} \delta_i
We now consider a few simple processing blocks to illustrate our model.	576	575	\end{align*}
In all our examples, we take an input throughput of 200~MB/s with a precision of 16 bits.	577	576
	578	577	\paragraph{Area constraint:}
Consider first a decimation block. Its purpose is to slow down the stream by keeping	579	578	This inequality expresses the area constraint of the FPGA. The maximum size of the FPGA chip is denoted $\mathcal{A}_{FPGA}$:
only some of the data at regular intervals. This interval is called the decimation factor and is denoted $N$.	580	579	\begin{align*}
	581	580	\sum _{i \in V} \mathcal{A}_i \leq \mathcal{A}_{FPGA}
According to our model, then:	582	581	\end{align*}
\begin{itemize}	583	582
\item $N \in \mathcal{P}_i$	584	583	\subsection{Modeling examples}
%TODO N or 1?	585	584	\label{exemples-modeles}