diff --git a/ifcs2018_journal.tex b/ifcs2018_journal.tex index 7312321..5d09450 100644 --- a/ifcs2018_journal.tex +++ b/ifcs2018_journal.tex @@ -116,7 +116,7 @@ not only the coefficient values and number of taps must be defined, but also the defining the coefficients and the sample size. For this reason, and because we consider pipeline processing (as opposed to First-In, First-Out FIFO memory batch processing) of radiofrequency signals, High Level Synthesis (HLS) languages \cite{kasbah2008multigrid} are not considered but -the problem is tackled at the Very-high-speed-integrated-circuit Hardware Description Language +the problem is tackled at the Very-high-speed-integrated-circuit Hardware Description Language (VHDL) level. Since latency is not an issue in an open-loop phase noise characterization instrument, the large number of taps in the FIR, as opposed to the shorter Infinite Impulse Response (IIR) filter, @@ -156,8 +156,8 @@ moment. \section{Methodology description} -Our objective is to develop a new methodology applicable to any Digital Signal Processing (DSP) -chain obtained by assembling basic processing blocks, with hardware and manufacturer independence. +Our objective is to develop a new methodology applicable to any Digital Signal Processing (DSP) +chain obtained by assembling basic processing blocks, with hardware and manufacturer independence. Achieving such a target requires defining an abstract model to represent some basic properties of DSP blocks such as performance (i.e. rejection or ripples in the bandpass for filters) and resource occupation. These abstract properties, not necessarily related to the detailed hardware @@ -172,22 +172,22 @@ of the analysis. In this demonstration, we focus on only two operations: filtering and shifting the number of bits needed to represent the data along the processing chain. 
-We have chosen these basic operations because shifting and the filtering have already been studied +We have chosen these basic operations because shifting and filtering have already been studied in the literature \cite{lim_1996, lim_1988, young_1992, smith_1998} providing a framework for -assessing our results. Furthermore, filtering is a core step in any radiofrequency frontend -requiring pipelined processing at full bandwidth for the earliest steps, including for +assessing our results. Furthermore, filtering is a core step in any radiofrequency frontend +requiring pipelined processing at full bandwidth for the earliest steps, including for time and frequency transfer or characterization \cite{carolina1,carolina2,rsi}. Addressing only two operations allows for demonstrating the methodology but should not be considered as a limitation of the framework which can be extended to assembling any number of skeleton blocks as long as performance and resource occupation can be determined. Hence, in this paper we will apply our methodology to simple DSP chains: a white noise input signal -is generated using a Pseudo-Random Number (PRN) generator or thanks at a radiofrequency-grade +is generated using a Pseudo-Random Number (PRN) generator or by a radiofrequency-grade Analog to Digital Converter (ADC) loaded by a 50~$\Omega$ resistor. Once samples have been digitized at a rate of 125~MS/s, filtering is applied to qualify the processing block performance -- practically meeting the radiofrequency frontend requirement of noise and bandwidth reduction by filtering and decimating. Finally, bursts of filtered samples are stored for post-processing, -allowing to assess either filter rejection for a given resource usage, or validating the rejection +allowing us either to assess the filter rejection for a given resource usage, or to validate the rejection when implementing a solution minimizing resource occupation. 
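As an aside, the demonstration chain just described (white-noise source, low-pass FIR, decimation, then trimming the sample size) can be sketched offline. The following is a minimal Python/NumPy illustration under assumed parameters (64 taps, a cutoff at a quarter of Nyquist, a 4-bit trim); it stands in for, and is not, the VHDL implementation discussed in the paper:

```python
import numpy as np

rng = np.random.default_rng(seed=1)
# 16-bit white-noise samples, standing in for the PRN generator or ADC output
x = rng.integers(-2**15, 2**15, size=1 << 16)

# Illustrative 64-tap windowed-sinc low-pass FIR (cutoff as a fraction of
# Nyquist); the paper instead explores quantized (taps, bit-width) sets.
numtaps, cutoff = 64, 0.25
m = np.arange(numtaps) - (numtaps - 1) / 2
taps = np.sinc(cutoff * m) * cutoff * np.hamming(numtaps)
taps /= taps.sum()                        # unity DC gain

y = np.convolve(x.astype(float), taps, mode="same")
y = y[::2]                                # decimate by 2 after low-pass filtering
y = np.round(y).astype(np.int64) >> 4     # shift to trim the data size
```

The filter-then-decimate order mirrors the frontend requirement in the text: the noise bandwidth is reduced before the sample rate is.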
The first step of our approach is to model the DSP chain and since we just optimize @@ -238,7 +238,7 @@ At least one coefficient is coded on $\pi_i^C$~bits, and in practice only $b_{C_ With these coefficients, the \texttt{freqz} function is used to estimate the magnitude of the filter transfer function. Comparing the performance between FIRs however requires defining a unique criterion. As shown in figure~\ref{fig:fir_mag}, -the FIR magnitude exhibits two parts: we focus here on the transitions width and the rejection rather than on the +the FIR magnitude exhibits two parts: we focus here on the transition width and the rejection rather than on the bandpass ripples as emphasized in \cite{lim_1988,lim_1996}. \begin{figure} @@ -280,7 +280,7 @@ the stopband the last 40\%, allowing 20\% transition width.} \end{figure} In the transition band, the behavior of the filter is left free; we only care about the passband and the stopband characteristics. -Our initial criterion considered the mean value of the stopband rejection, as shown in figure~\ref{fig:mean_criterion}. This criterion +Our initial criterion considered the mean value of the stopband rejection, as shown in figure~\ref{fig:mean_criterion}. This criterion yields unacceptable results since notches overestimate the rejection capability of the filter. Furthermore, the losses within the passband are not considered and might be excessive for the excessively wide transition widths introduced for filters with few coefficients. Such biases are compensated for by the second considered criterion, which is based on computing the maximum rejection within the stopband minus the mean of the absolute value of the passband rejection. With this criterion, the results are significantly improved, as shown in figure~\ref{fig:custom_criterion}, and meet the expected rejection capability of low pass filters. 
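Both criteria are easy to reproduce numerically. The sketch below is a Python/NumPy stand-in with a hand-rolled \texttt{freqz} equivalent (the paper uses Octave's \texttt{freqz}), the 40\%/40\% passband/stopband split with a 20\% transition from the figure, an illustrative 32-tap windowed-sinc coefficient set, and reads "maximum rejection" as the worst-case (highest) stopband level, the reading that removes the notch bias:

```python
import numpy as np

def mag_db(taps, worN=1024):
    """FIR transfer-function magnitude in dB on [0, pi), similar to freqz."""
    w = np.linspace(0.0, np.pi, worN, endpoint=False)
    h = np.exp(-1j * np.outer(w, np.arange(len(taps)))) @ taps
    return 20 * np.log10(np.maximum(np.abs(h), 1e-12))

# Illustrative 32-tap windowed-sinc low-pass, edge at 40% of Nyquist
m = np.arange(32) - 15.5
taps = np.sinc(0.4 * m) * 0.4 * np.hamming(32)
taps /= taps.sum()

mag = mag_db(taps)
n = mag.size
passband = mag[: int(0.4 * n)]     # passband: first 40% of the band
stopband = mag[int(0.6 * n):]      # stopband: last 40%, 20% transition left free

# First criterion: mean stopband rejection -- deep notches inflate it
mean_criterion = -stopband.mean()
# Second criterion: worst-case stopband rejection minus mean passband losses
custom_criterion = -stopband.max() - np.abs(passband).mean()
```

On this example the mean-based figure exceeds the worst-case figure by several dB, which is exactly the overestimation the text attributes to the notches.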
@@ -295,7 +295,7 @@ Such biases are compensated for by the second considered criterion which is base \begin{figure} \centering \includegraphics[width=\linewidth]{images/colored_custom_criterion} -\caption{Custom criterion (maximum rejection in the stopband minus the mean of the absolute value of the passband rejection) +\caption{Custom criterion (maximum rejection in the stopband minus the mean of the absolute value of the passband rejection) comparison between monolithic filter and cascaded filters} \label{fig:custom_criterion} \end{figure} @@ -304,7 +304,7 @@ Thanks to the latter criterion which will be used in the remainder of this paper and estimate their rejection. Figure~\ref{fig:rejection_pyramid} exhibits the rejection as a function of the number of coefficients and the number of bits representing these coefficients. The curve shaped as a pyramid exhibits optimum configuration sets at the vertex where both edges meet. -Indeed for a given number of coefficients, increasing the number of bits over the edge will not improve the rejection. +Indeed, for a given number of coefficients, increasing the number of bits over the edge will not improve the rejection. Conversely, when setting a given number of bits, increasing the number of coefficients will not improve the rejection. Hence the best coefficient sets are on the vertex of the pyramid. @@ -316,15 +316,15 @@ the rejection. Hence the best coefficient sets are on the vertex of the pyramid. \end{figure} Although we have an efficient criterion to estimate the rejection of one set of coefficients (taps), -we have a problem when we cascade filters and estimate the criterion as a sum two or more individual criteria. +we have a problem when we cascade filters and estimate the criterion as a sum of two or more individual criteria. 
If the FIR filter coefficients are the same between the stages, we have: $$F_{total} = F_1 + F_2$$ But selecting two different sets of coefficients will yield a more complex situation in which the previous relation is no longer valid as illustrated in figure~\ref{fig:sum_rejection}. The red and blue curves are two different filters with maxima and notches not located at the same frequency offsets. -Hence when summing the transfer functions, the resulting rejection shown as the dashed yellow line is improved +Hence when summing the transfer functions, the resulting rejection shown as the dashed yellow line is improved with respect to a basic sum of the rejection criteria shown as the dotted yellow line. -Thus, estimating the rejection of filter cascades is more complex than takin the sum of all the rejection +Thus, estimating the rejection of filter cascades is more complex than taking the sum of all the rejection criteria of each filter. However, since this sum underestimates the rejection capability of the cascade, this lower bound is considered as a pessimistic and acceptable criterion for deciding on the suitability of the filter cascade to meet design criteria. @@ -423,7 +423,7 @@ rejection required. \label{sec:workflow} In this section, we describe the workflow to compute all the results presented in sections~\ref{sec:fixed_area} -and \ref{sec:fixed_rej}. Figure~\ref{fig:workflow} shows the global workflow and the different steps involved +and \ref{sec:fixed_rej}. Figure~\ref{fig:workflow} shows the global workflow and the different steps involved in the computation of the results. \begin{figure} @@ -463,7 +463,7 @@ and a deploy script ((1b) on figure~\ref{fig:workflow}). 
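Returning to the cascade criterion from the rejection discussion above: cascading filters multiplies their transfer functions, so their magnitudes in dB add, and the sum of per-stage worst-case rejections is therefore always a pessimistic (lower) bound on the cascade's worst-case rejection. A small NumPy check of this bound, using illustrative windowed-sinc stages rather than the paper's quantized coefficient sets:

```python
import numpy as np

def stop_db(taps, worN=1024):
    """Stopband (last 40% of [0, pi)) of the FIR magnitude in dB."""
    w = np.linspace(0.0, np.pi, worN, endpoint=False)
    h = np.exp(-1j * np.outer(w, np.arange(len(taps)))) @ taps
    mag = 20 * np.log10(np.maximum(np.abs(h), 1e-12))
    return mag[int(0.6 * worN):]

def lowpass(numtaps, window):
    m = np.arange(numtaps) - (numtaps - 1) / 2
    taps = np.sinc(0.4 * m) * 0.4 * window(numtaps)
    return taps / taps.sum()

s1 = stop_db(lowpass(16, np.hamming))   # two different stages whose notches
s2 = stop_db(lowpass(24, np.hanning))   # fall at different frequencies
cascade = s1 + s2                       # cascading multiplies H(f): dB curves add

worst = lambda s: -s.max()              # worst-case rejection over the stopband
# Summing per-stage criteria underestimates (or at best equals) the cascade:
assert worst(cascade) >= worst(s1) + worst(s2)
```

The inequality holds for any pair of stages since the maximum of a sum never exceeds the sum of the maxima, matching the "pessimistic and acceptable" reading in the text.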
The TCL script describes the whole digital processing chain from the beginning (the raw signal data) to the end (the filtered data) in a language compatible -with proprietary synthesis software, namely Vivado for Xilinx and Quartus for +with proprietary synthesis software, namely Vivado for Xilinx and Quartus for Intel/Altera. The raw input data are generated by a 20-bit Pseudo-Random Number (PRN) generator inside the FPGA, and $\Pi^I$ is fixed at 16~bits. Then the script builds each stage of the chain with a generic FIR task that @@ -482,7 +482,7 @@ FPGA (xc7z010clg400-1) and two LTC2145 14-bit 125~MS/s ADC, loaded with 50~$\Ome provide a broadband noise source. The board runs the Linux kernel and surrounding environment produced from the Buildroot framework available at \url{https://github.com/trabucayre/redpitaya/}: configuring -the Zynq FPGA, feeding the FIR with the set of coefficients, executing the simulation and +the Zynq FPGA, feeding the FIR with the set of coefficients, executing the simulation and fetching the results are automated. The deploy script uploads the bitstream to the board ((3) on @@ -703,9 +703,7 @@ $n$ & Time (MAX/500) & Time (MAX/1000) & Time (MAX/1500) As expected, the computation time seems to rise exponentially with the number of stages. % TODO: exponential? When the area is limited, the design exploration space is smaller and the solver is able to -find an optimal solution faster. On the contrary, in the case of MAX/1500 with -5~stages, we were not able to obtain a result after 40~hours of computation when the program was -manually stopped. +find an optimal solution faster. \subsection{Minimizing resource occupation at fixed rejection}\label{sec:fixed_rej}