diff options
Diffstat (limited to 'docs/resampler.html')
-rw-r--r-- | docs/resampler.html | 574 |
1 files changed, 574 insertions, 0 deletions
diff --git a/docs/resampler.html b/docs/resampler.html new file mode 100644 index 0000000..eb608e0 --- /dev/null +++ b/docs/resampler.html @@ -0,0 +1,574 @@ +<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> +<html> +<head> + <title>Zita-resampler.</title> + <meta name="Author" content="Fons Adriaensen (fons@linuxaudio.org)"> + <meta name="Description" content="The zita-resampler library."> + <meta name="Keywords" content="resampling sound audio dsp linux"> + <link rel=StyleSheet href="zitadocs.css" type="text/css" media="all"> +</head> + +<body> + + +<h1>Libzita-resampler</h1> +<ul> +<li><a href = "#introduction">Introduction</a> +<li><a href = "#algorithm">The algorithm</a> +<li><a href = "#description">API description</a> +<li><a href = "#reference">API reference</a> +</ul> + +<br> +<a name = "introduction"></a> +<h2>Introduction</h2> +<p> +Libzita-resampler is a C++ library for resampling audio signals. It is designed +to be used within a real-time processing context, to be fast, and to provide +high-quality sample rate conversion. +</p> +<p> +The library operates on signals represented in single-precision floating point +format. For multichannel operation both the input and output signals are +assumed to be stored as interleaved samples. +</p> +<p> +The API allows a trade-off between quality and CPU load. For the latter +a range of approximately 1:6 is available. Even at the highest quality +setting libzita-resampler will be faster than most similar libraries +providing the same quality, e.g. libsamplerate. +</p> +<p> +In many real-time resampling applications (e.g. an audio player), processing +is driven by the ouput sample rate: each processing period requires a fixed +number of output samples, and the input side has to adapt, providing whatever +number of samples required. The inverse situation (less common, but possible) +would be e.g. a recording application that writes an audio file at a rate that +is different from the hardware sample rate. In that case the number of input +samples is fixed for each processing period. The API provided by libzita-resampler +is fully symmetric in this respect - it handles both situations in the exactly the +same way, using the same application code. +</p> +<p> +Libzita-resampler provides two classes: +</p> +<p> +The <b>Resampler</b> class performs resampling at a fixed ratio <b>F_out / F_in</b> +which is required to be <b>≥ 1/16</b> and be reducible to the form <b> b / a</b> +with <b>a, b</b> integer and <b>b ≤ 1000</b>. This includes all the 'standard' +ratios, e.g. 96000 / 44100 = 320 / 147. These restrictions allow for a more efficient +implementation. +</p> +<p> +The <b>VResampler</b> class provides an arbitrary ratio <b>r</b> in the range +<b>1/16 ≤ r ≤ 64</b> and which can variable within a range of <b>0.95 to +16.0</b> w.r.t. the originally configured one. The lower limit here is necessary +because this class still uses a fixed multiphase filter, with only the phase step +being variable. This class was developed for converting between two nominally fixed +sample rates with a ratio which is not known exactly and may even drift slowly, e.g. +when combining sound cards wich do not have a common word clock. This resampler is +somewhat less efficient than the fixed ratio one since it has to interpolate filter +coefficients, but the difference is marginal when used on multichannel signals. +</p> +<p> +Both classes provide essentially the same API, with only small differences +where necessary. + +<br> +<a name = "algorithm"></a> +<h2>The algorithm</h2> +<p> +Libzita-resampler implements <b>constant bandwidth resampling</b>. In contrast +to e.g. cubic interpolation it does not consider the actual shape of the +waveform represented by the input samples, but rather operates in the spectral +domain. +</p> +<p> +Let +<ul> +<li><b>F_in, F_out</b>    be the input and output sample rates,</li> +<li><b>F_min</b>    the lower of the two,</li> +<li><b>F_lcm</b>    their lowest common multiple,</li> +<li><b>b = F_lcm / F_in</b>,</li> +<li><b>a = F_lcm / F_out</b></li>. +</ul> +<p> +Then the calculation performed by zita-resampler is equivalent to: +</p> +<ul> +<li>upsampling to a rate of <b>F_lcm</b> by inserting <b>b - 1</b> +zero-valued samples after each input sample,</li> +<li>low-pass filtering of the upsampled signal to remove everything +above <b>F_min / 2</b>,</li> +<li>retaining only the first of each series of <b>a</b> filtered +samples.</li> +</ul> +<p> +This is of course not how things are implemented in the resampler code. +Only those samples that are actually output are computed, and the inserted +zeros are never used. In practice this means there is a set of <b>b</b> +different FIR filters that each output one sample in turn, in round-robin +fashion. All these filters have the same frequency response, but different +delays that correspond to the relative position in time of the input and +output samples. +</p> +<p> +A real-world filter can't be perfect, it is always a compromise between +complexity (CPU load) and performance. In the context of resampling this +compromise manifests itself as deviations from the ideally flat frequency +response, and as aliasing - the same phenomenon that occurs with AD and +DA conversion. For aliasing, two cases need to be considered: +</p> +<p> +<b>Upsampling.</b> In this case <b>F_min = F_in</b>, and input signals +below but near to <b>F_in / 2</b> will also appear in the output +just above this frequency. This is similar to DA conversion. +</p> +<p> +<b>Downsampling.</b> In this case <b>F_min = F_out</b>, and input signals +above but near to <b>F_out / 2</b> will also appear in the output +just below this frequency. This is similar to AD conversion. +</p> +<p> +In the design of zita-resampler it was assumed that in most cases +conversion will be between the 'standard' audio sample rates (44.1, +48, 88.2, 96, 192 kHz), and that consequently frequency response +errors and aliasing will occur only above the upper limit of the +audible frequency range. Given this assumption, some pragmatic +trade-offs can be made. +</p> +<p> +The filter used by libzita-resampler is dimensioned to reach an +attenuation of 60dB at the Nyquist frequency. The initialisation +function takes a parameter named <b>hlen</b> that is in fact half +the length of the symmetrical FIR filter expressend in samples at +the rate <b>F_min</b>. The valid range for <b>hlen</b> is 16 to 96. +The figure below shows the filter responses for <b>hlen</b> = 32, +48, and 96. The <b>x</b> axis is <b>F / F_min</b>, the <b>y</b> axis +is in dB. The lower part of the traces is the mirrored continuation +of the response above the Nyquist frequency, i.e. the aliasing. +Note that 20 kHz corresponds to x = 0.416 for a sample rate of 48 kHz, +and to x = 0.454 for a sample rate of 44.1 kHz. +</p> +<p> +<img src="filt1.png"> +</p> +<p> +<br> +The same traces with a reduced vertical range, showing the passband +response. +</p> +<p> +<img src="filt2.png"> +</p> +<p> +From these figures it should be clear that <b>hlen = 32</b> should +provide very high quality for <b>F_min</b> equal to 48 kHz or higher, +while <b>hlen = 48</b> should be sufficient for an <b>F_min</b> of +44.1 kHz. The validity of these assumptions was confirmed by a series +of listening tests. If fact the conclusion of these test was that even +at 44.1 kHz the subjects (all audio specialists or musicians) could not +detect any significant difference between <b>hlen</b> values of 32 and 96. +</p> + +<br> +<a name = "description"></a> +<h2>API description</h2> +<p> +The constructors of both classes do not initialise the object for a particular +resampling rate, quality or number of channels - this is done by a separate +function member which can be used as many times as necessary. This means you +can allocate <b>(V)Resampler</b> objects before the actual resampling parameters +are known. +</p> +<p> +The <b>setup ()</b> member initialises a resampler for a combination of input +sample rate, output sample rate, number of channels, and filter length. This +function allocates and computes the filter coefficient tables and is definitely +not RT-safe. The actual tables will be shared with other resampler instances if +possible - the library maintains a reference-counted collection of them. After +the initialisation by <b>setup ()</b>, the <b>process ()</b> member can be called +repeatedly to actually resample audio signals. This function is RT-safe and +can be used within e.g. a JACK callback. The <b>clear ()</b> member restores +the object to the initial state it has after construction, and is also called +from the destructor. You can safely call <b>setup ()</b> again without first +calling <b>clear ()</b> - doing this avoids recomputation of the filter +coefficients in some cases. +</p> +<p> +Both classes have four public data members which are used as input and output +parameters of <b>process ()</b>, in the same way as you would use a C struct +to pass parameters. These are: +</p> +<pre> + unsigned int inp_count; // number of frames in the input buffer + unsigned int out_count; // number of frames in the output buffer + float *inp_data; // pointer to first input frame + float *out_data; // pointer to first output frame +</pre> +<p> +As <b>process ()</b> does its work, it increments the pointers and decrements the +counts. The call returns when either of the counts is zero, i.e. when the input +buffer is empty or the output buffer is full. You should then take appropriate +action and if necessary call <b>process ()</b> again. +</p> +<p> +When <b>process ()</b> returns, the four parameter values exactly reflect +those parts of both buffers <b>that have not yet been used</b>. +One of them will be fully used, with the corresponding count being zero. +The remaining part of the other, pointed to by the returned pointer, can +always be replaced before the next call to <b>process ()</b>, but this is +entirely optional. +</p> +<p> +When for example <b>inp_count</b> is zero, you have to fill the input buffer +again, or provide a new one, and re-initialise the input count and pointer. +If at that time <b>out_count</b> is not zero, you can either leave the output +parameters as they are for the next call to <b>process ()</b>, or you could +empty the part of the output buffer that has been filled and re-use it from +the start, or provide a completely different one. +</p> +<p> +The same applies to the input buffer when it is not empty on return of +<b>process ()</b>: it can be left alone or be replaced. A number of input +samples is stored internally between <b>process ()</b> calls as part of the +resampler state, but this never includes samples that have not yet been used. +So you can 'revise' the input data, starting from the frame pointed to by the +returned <b>inp_data</b>, up to the last moment. +</p> +<p> +All this means that both classes will interface easily with fixed input and +output buffers, with dynamically generated input signals, and also with +lock-free circular buffers. +</p> +<p> +Either of the two pointers can be NULL. When <b>inp_data</b> is zero, the effect +is to insert zero-valued input samples, as if you had supplied a zero-filled +buffer of length <b>inp_count</b>. When <b>out_data</b> is zero, input samples +will be consumed, the internal state of the resampler will advance normally +and <b>out_count</b> will decrement, but no output samples are written (in +fact they are not even computed). +</p> +<a name = "padding"></a> +<p> +Note that libzita-resampler does <b>not</b> automatically insert zero-valued +input samples at the start and end of the resampling process. The API makes it +easy to add such padding, and doing this is left entirely up to the user. +</p> +<p> +The <b>inpsize ()</b> member returns the lenght of the FIR filter expressed in +input samples. At least this number of samples is required to produce an output +sample. If <b>k</b> is the value returned by this function, then +</p> +<ul> +<li>inserting <b>k / 2 - 1</b> zero-valued samples at the start will align the +first input and output samples,</li> +<li>inserting <b>k - 1</b> zero valued samples will ensure that the output +includes the full filter response for the first input sample.</li> +</ul> +<p> +Similar considerations apply at the end of the input data: +</p> +<ul> +<li>inserting <b>k / 2</b> zero-valued samples at the end will ensure +that the last output sample produced will correspond to a position as close +as possible but not past the last real input sample,</li> +<li>inserting <b>k - 1</b> zero valued samples will ensure that the output +includes the full filter response for the last real input sample.</li> +</ul> +<p> +<a name = "inpdist"></a> +The <b>inpdist ()</b> member returns the distance or delay expressed in sample +periods at the input rate, between the first output sample that will be ouput +by the next call to <b>process ()</b> and the first input sample that will be +read (i.e. that is not yet part of the internal state). +<br><br> +In the picture below, the red dots represent input samples and the blue ones +are the output. Solid dots are samples already used or output. After a call +to process () the resampler object remains in a state ready to produce the next +output sample, except that it may have to input one or more new samples first. +The filter is aligned with the next output sample. In this case one more input +sample is required to compute it, and the input distance is 2.7. +</p> +<p> +<img src="inpdist.png"> +</p> +<p> +After a resampler is prefilled with <b>inpsize () / 2 - 1</b> samples as described +above, <b>inpdist ()</b> will be zero - the first output sample corresponds exactly +to the first input. Note that without prefilling the distance is negative, which +means that the first output sample corresponds to some point past the start of the +input data. +</p> +<p> +The 'resample' application supplied with the library sources provides +an example of how to use the <b>Resampler</b> class. For an example +using <b>VResampler</b> you can have a look at zita_a2j and zita_ja2. +</p> + +<br> +<a name = "reference"></a> +<h2>API reference</h2> +<p> +Public function members of the <b>Resampler</b> and <b>VResampler</b> classes. +Functions listed without a class prefix are available for both classes. +</p> +<ul> +<li><a href = "#resampler">Resampler ()</a> +<li><a href = "#resampler">~Resampler ()</a> +<li><a href = "#vresampler">VResampler ()</a> +<li><a href = "#vresampler">~VResampler ()</a> +<li><a href = "#res_setup">Reampler::setup ()</a> +<li><a href = "#vres_setup">VResampler::setup ()</a> +<li><a href = "#clear">clear ()</a> +<li><a href = "#reset">reset ()</a> +<li><a href = "#process">process ()</a> +<li><a href = "#nchan">nchan ()</a> +<li><a href = "#inpsize">inpsize ()</a> +<li><a href = "#inpdist">inpdist ()</a> +<li><a href = "#vres_set_rratio">VResampler::set_rratio ()</a> +<li><a href = "#vres_set_rrfilt">VResampler::set_rrfilt ()</a> +</ul> +<p> +Public data members of the <b>Resampler</b> and <b>VResampler</b> classes. +</p> +<ul> +<li><a href = "#inp_count">inp_count</a> +<li><a href = "#out_count">out_count</a> +<li><a href = "#inp_data">inp_data</a> +<li><a href = "#out_data">out_data</a> +</ul> + +<a name="resampler"></a> +<br><p class="apihdr">Resampler (void);<br>~Resampler (void);</p> +<p> +<b>Description:</b> Constructor and destructor. The constructor just creates an object that takes +almost no memory but needs to be configured by <a href="#setup"><b>setup ()</b></a> before it +can be used. The destructor calls <a href="#clear"><b>clear ()</b></a>. +<br><br> +<b>RT-safe:</b> No +</p> + +<a name="vresampler"></a> +<br><p class="apihdr">VResampler (void);<br>~VResampler (void);</p> +<p> +<b>Description:</b> Constructor and destructor. The constructor just creates an object that takes +almost no memory but needs to be configured by <a href="#setup"><b>setup ()</b></a> before it +can be used. The destructor calls <a href="#clear"><b>clear ()</b></a>. +<br><br> +<b>RT-safe:</b> No +</p> + +<a name="res_setup"></a> +<br><p class="apihdr">int   Resampler::setup (unsigned int   fs_inp, unsigned int fs_out, unsigned int nchan, unsigned int hlen);</p> +<p> +<b>Description:</b> Configures the object for a combination of input / output sample rates, number +of channels, and filter lenght.<br>If the parameters are OK, creates the filter coefficient tables +or re-uses existing ones, allocates some internal resources, and returns via <a href="#reset"> +<b>reset ()</b></a>. +<br><br> +<b>Parameters:</b> +</p> +<p class="indent"> +<b>fs_inp, fs_out:</b> The input and output sample rates. The ratio <b>fs_out +/ fs_inp</b> must be <b> ≥ 1/16</b> and reducible to the form <b>b / a</b> +with <b>a, b</b> integer and <b>b ≤ 1000</b>. +<br><br> +<b>nchan:</b> Number of channels, must not be zero. +<br><br> +<b>hlen:</b> Half the lenght of the filter expressed in samples at the lower of +input and output rates. This parameter determines the 'quality' as explained +<a href="#algorithm">here</a>. For any fixed combination of the other parameters, +cpu load will be roughly proportional to <b>hlen</b>. The valid range is +<b>16 ≤ hlen ≤ 96</b>. +</p> +<p> +<b>Returns:</b> Zero on success, non-zero otherwise. +<br><br> +<b>Remark:</b> It is perfectly safe to call this function again without +having called <a href="#clear"><b>clear ()</b></a> first. If only the number +of channels is changed, doing this will avoid recalculation of the filter tables +even if they are not shared. +<br><br> +<b>RT-safe:</b> No +</p> + +<a name="vres_setup"></a> +<br><p class="apihdr">int   VResampler::setup (double ratio, unsigned int nchan, unsigned int hlen);</p> +<p> +<b>Description:</b> Configures the object for a combination of resampling ratio, number of channels, +and filter lenght.<br>If the parameters are OK, creates the filter coefficient tables or re-uses +existing ones, allocates some internal resources, and returns via <a href="#reset"> +<b>reset ()</b></a>. +<br><br> +<b>Parameters:</b> +</p> +<p class="indent"> +<b>ratio:</b> The resampling ratio wich must be between <b>1/16</b> and <b>64</b>. +<br><br> +<b>nchan:</b> Number of channels, must not be zero. +<br><br> +<b>hlen:</b> Half the lenght of the filter expressed in samples at the lower of +the input and output rates. This parameter determines the 'quality' as explained +<a href="#algorithm">here</a>. For any fixed combination of the other parameters, +cpu load will be roughly proportional to <b>hlen</b>. The valid range is +<b>16 ≤ hlen ≤ 96</b>. +</p> +<p> +<b>Returns:</b> Zero on success, non-zero otherwise. +<br><br> +<b>Remark:</b> It is perfectly safe to call this function again without +having called <a href="#clear"><b>clear ()</b></a> first. If only the number +of channels is changed, doing this will avoid recalculation of the filter tables +even if they are not shared. +<br><br> +<b>RT-safe:</b> No +</p> + +<a name="clear"></a> +<br><p class="apihdr">void   clear (void);</p> +<p> +<b>Description:</b> Deallocates resources (if not shared) and returns the object to +the unconfigured state as after construction. Also called by the destructor. +<br><br> +<b>RT-safe:</b> No +</p> + +<a name="reset"></a> +<br><p class="apihdr">int   reset (void);</p> +<p> +<b>Description:</b> Resets the internal state of the resampler. Any stored +input samples are cleared, the filter phase and the four <a href="#inp_count"> +public data members</a> are set to zero. This should be called before starting +resampling a new stream with the same configuration as the previous one. +<br><br> +<b>Returns:</b> Zero if the resampler is configured , non-zero otherwise. +<br><br> +<b>RT-safe:</b> Yes. +</p> + +<a name="process"></a> +<br><p class="apihdr">int   process (void);</p> +<p> +<b>Description:</b> Resamles the input signal until either the input buffer +is empty or the output buffer is full. Information on the input and output +buffers is passed to this function using the four <a href="#inp_count"> public +data members</a> described below. The same four values are updated on return. +<br><br> +<b>Returns:</b> Zero if the resampler is configured, non-zero otherwise. +<br><br> +<b>RT-safe:</b> Yes. +</p> + +<a name="nchan"></a> +<br><p class="apihdr">int   nchan (void);</p> +<p> +<b>Description:</b> Accessor. +<br><br> +<b>Returns:</b> The number of channels the resampler is configured for, +or zero if it is unconfigured. Input and output buffers are assumed to contain +this number of channels in interleaved format. +<br><br> +<b>RT-safe:</b> Yes +</p> + +<a name="inpsize"></a> +<br><p class="apihdr">int   inpisze (void);</p> +<p> +<b>Description:</b> Accessor. +<br><br> +<b>Returns:</b> If the resampler is configured, the lenght of the +finite impulse filter expressed in samples at the input sample rate, +or zero otherwise. This value may be used to determine the number of +silence samples to insert at the start and end when resampling e.g. +an impulse response. See <a href="#padding">here</a> for more about this. +This function was called 'filtlen ()' in previous releases. +<br><br> +<b>RT-safe:</b> Yes +</p> + +<a name="inpdist"></a> +<br><p class="apihdr">double   inpdist (void);</p> +<p> +<b>Description:</b> Accessor. +<br><br> +<b>Returns:</b> If the resampler is configured, the distance between +the next output sample and the next input sample, expressed in sample +periods at the input rate, zero otherwise. See <a href="#inpdist"> +here</a> for more about this. +<br><br> +<b>RT-safe:</b> Yes +</p> + +<a name="vres_set_rratio"></a> +<br><p class="apihdr">void   VResampler::set_rratio (double ratio);</p> +<p> +<b>Description:</b> Sets the resampling ratio relative to the one configured +by <a href="#vres_setup"> setup ()</a>. The valid range is <b>0.95 ≤ +ratio ≤ 16</b>. +<br><br> +<b>Parameters:</b> +</p> +<p class="indent"> +<b>ratio:</b> The relative resampling ratio. +</p> +<p> +<b>RT-safe:</b> Yes +</p> + +<a name="vres_set_rrfilt"></a> +<br><p class="apihdr">void   VResampler::set_rrfilt (double time);</p> +<p> +<b>Description:</b> Sets the time constant of the first order filter applied +on values set by <a href="#vres_set_rratio"> set_rratio ()</a>. The time is +expressed as sample periods at the output rate. The default is zero, which +means changes are applied instantly. +<br><br> +<b>Parameters:</b> +</p> +<p class="indent"> +<b>time:</b> The filter time constant. +</p> +<p> +<b>RT-safe:</b> Yes +</p> + +<a name="inp_count"></a> +<br><p class="apihdr">unsigned int   inp_count;</p> +<p> +<b>Description:</b> Data member, input / output parameter of the <a href="#process"> +process ()</a> function. This value is always equal to the number of frames in the +input buffer that have not yet been read by the process () function. It should be set +to the number of available frames before calling process (). +</p> + +<a name="out_count"></a> +<br><p class="apihdr">unsigned int   out_count;</p> +<p> +<b>Description:</b> Data member, input / output parameter of the <a href="#process"> +process ()</a> function. This value is always equal to the number of frames in the +output buffer that have not yet been written by the process () function. It should be +set to the size of the output buffer before calling process (). +</p> + +<a name="inp_data"></a> +<br><p class="apihdr">float   *inp_data;</p> +<p> +<b>Description:</b> Data member, input / output parameter of the <a href="#process"> +process ()</a> function. If not zero (NULL), this points to the next frame in the +input buffer that will be read by process (). If set to zero (NULL), the resampler +will proceed normally but use zero-valued samples as input (still counted by +<b>inp_count</b>). +</p> + +<a name="out_data"></a> +<br><p class="apihdr">float   *out_data;</p> +<p> +<b>Description:</b> Data member, input / output parameter of the <a href="#process"> +process ()</a> function. If not zero (NULL), this points to the next frame in the +output buffer that will be written by process (). If set to zero (NULL), the resampler +will proceed normally but the output is discarded (but still counted by <b>out_count</b>). +</p> + +</body> +</html> |