
I'll admit, I am asking this question rhetorically. I am curious what answers will come back out of this.

If you choose to answer this, make sure you understand the Shannon-Nyquist sampling theorem well, particularly reconstruction. Also be careful of "gotchas" in textbooks. The engineering notion of the Dirac delta impulse function is sufficient; you need not worry about all of the "distribution" stuff. The Dirac impulse as a nascent delta function is good enough:

$$ \delta(t) = \lim_{\tau \to 0} \frac{1}{\tau} \operatorname{rect}\left(\frac{t}{\tau} \right) $$

where

$$ \mathrm{rect}(t) \triangleq \begin{cases} 0 & \mbox{if } |t| > \frac{1}{2} \\ 1 & \mbox{if } |t| < \frac{1}{2} \\ \end{cases} $$

Issues regarding precision, bit width of sample words, and quantization done in conversion are not relevant to this question. But scaling from input to output is relevant.

I will write my own answer to it eventually unless someone else presents an accurate and pedagogically useful answer. I might even put a bounty on this (may as well spend what little rep I have).

Have at it.

robert bristow-johnson
  • are you interested in hearing about aliasing primarily? – deadude Dec 17 '15 at 01:02
  • nope. i am assuming all of the rules of the Sampling Theorem are adhered to. that is, no content or energy in the continuous-time input getting sampled that is at or above \$ \frac{f_\mathrm{s}}{2} \$. now, remember there is a difference between "aliases" and "images". – robert bristow-johnson Dec 17 '15 at 01:05
  • as far as I remember, the zero-order hold is merely the delay between samples in the digital system, and obviously can affect the analog side of things between one sample and the next – KyranF Dec 23 '15 at 17:46
  • @KyranF, it's a little more than that. – robert bristow-johnson Dec 23 '15 at 19:04
  • @robertbristow-johnson from the answers given by Timo indeed it looks more involved than I thought. Good luck with that! – KyranF Dec 23 '15 at 20:24
  • @KyranF , wait for the bounty period to end. it's pretty simple but oft misunderstood. – robert bristow-johnson Dec 23 '15 at 20:36
  • okay, i figgered out what @ScottSeidman is doing. thanks, Scott. – robert bristow-johnson Jan 08 '16 at 03:56
  • You should remove all the text that isn't actually part of the question. Like the parts talking about how you want the question to be answered. They are irrelevant. – user253751 Sep 08 '20 at 18:20
  • and, @user253751, the fact is that you don't know the hell what you're typing about. rigor is not irrelevant. – robert bristow-johnson Sep 08 '20 at 19:17
  • @robertbristow-johnson "I will write my own answer eventually" is irrelevant. "Have at it" is irrelevant. "I am asking this rhetorically" is irrelevant. "Be careful of gotchas in textbooks" is irrelevant. "You need to know [...]" is relevant but should not be part of a question. – user253751 Sep 08 '20 at 19:31

3 Answers


Set-up

We consider a system with an input signal \$x(t)\$, and for clarity we refer to the values of \$x(t)\$ as voltages, where necessary. Our sample period is \$T\$, and the corresponding sample rate is \$f_s \triangleq 1/T\$.

For the Fourier transform, we choose the conventions $$ X(i 2 \pi f) = \mathcal{F}(x(t)) \triangleq \int_{-\infty}^\infty x(t) e^{-i 2 \pi f t} \mathrm{d}t, $$ giving the inverse Fourier transform $$ x(t) = \mathcal{F}^{-1}(X(i 2 \pi f)) \triangleq \int_{-\infty}^\infty X(i 2 \pi f) e^{i 2 \pi f t} \mathrm{d}f. $$ Note that with these conventions, \$X\$ is a function of the Laplace variable \$s = i \omega = i 2 \pi f\$.

Ideal sampling and reconstruction

Let us start from ideal sampling: according to the Nyquist-Shannon sampling theorem, given a signal \$x(t)\$ which is bandlimited to \$f < \frac{1}{2} f_s\$, i.e. $$ X(i 2 \pi f) = 0,\qquad \mathrm{when }\, |f| \geq \frac{1}{2}f_s, $$ then the original signal can be perfectly reconstructed from the samples \$x[n] \triangleq x(n T)\$, where \$n\in \mathbb{Z}\$. In other words, given the condition on the bandwidth of the signal (called the Nyquist criterion), it is sufficient to know its instantaneous values at equidistant discrete points in time.

The sampling theorem also gives an explicit method for carrying out the reconstruction. Let us justify this in a way which will be helpful in what follows: let us estimate the Fourier transform \$X(i 2 \pi f)\$ of a signal \$x(t)\$ by its Riemann sum with step \$T\$: $$ X(i 2 \pi f) \sim \sum_{n = -\infty}^\infty x(n \Delta t)e^{-i 2 \pi f n \Delta t} \Delta t, $$ where \$\Delta t = T\$. Let us rewrite this as an integral, to quantify the error we're making: $$ \begin{align} \sum_{n = -\infty}^\infty x(n T)e^{-i 2 \pi f n T}T &= \int_{-\infty}^\infty \sum_{n = -\infty}^\infty x(t)e^{-i 2 \pi f t}T \delta(t - n T)\mathrm{d}t\\ &= X(i 2 \pi f) * \mathcal{F}\left(T \sum_{n = -\infty}^\infty \delta(t - nT)\right) \\ &= \sum_{k = -\infty}^\infty X(i 2 \pi (f - k/T)), \tag{1} \label{discreteft} \end{align} $$ where we used the convolution theorem on the product of \$x(t)\$ and the sampling function \$T \sum_{n = -\infty}^\infty \delta(t - n T)\$, the fact that the Fourier transform of the sampling function is \$\sum_{k = -\infty}^\infty \delta(f - k/T)\$, and carried out the integral over the delta functions.

Note that the left hand side is exactly \$T X_{1/T}(i 2 \pi f T)\$, where \$ X_{1/T}(i 2 \pi f T)\$ is the discrete time Fourier transform of the corresponding sampled signal \$x[n] \triangleq x(n T)\$, with \$f T\$ the dimensionless discrete time frequency.
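This scaling relation can be checked numerically. The sketch below uses an assumed Gaussian test signal, sample rate, and truncation window (a Gaussian is only approximately bandlimited, but its spectral tails are negligible at these parameters):

```python
import numpy as np

# Check that T times the DTFT of the samples matches the Fourier transform
# of x(t) at a baseband frequency (eq. 1 with non-overlapping terms).
# Test signal: Gaussian x(t) = exp(-t^2 / (2 sigma^2)), whose FT in the
# conventions above is sigma * sqrt(2 pi) * exp(-2 pi^2 sigma^2 f^2).
sigma = 0.1            # seconds (assumed)
fs = 100.0             # Hz (assumed); the Gaussian spectrum is negligible at fs/2
T = 1.0 / fs

n = np.arange(-2000, 2001)                 # truncation of the infinite sum
x_n = np.exp(-(n * T) ** 2 / (2 * sigma ** 2))

f = 3.0                                    # a baseband frequency, Hz
dtft_scaled = T * np.sum(x_n * np.exp(-2j * np.pi * f * n * T))
ft_exact = sigma * np.sqrt(2 * np.pi) * np.exp(-2 * np.pi ** 2 * sigma ** 2 * f ** 2)
err = abs(dtft_scaled - ft_exact)
```

With the \$T\$ in front, the two agree to numerical precision; dropping it would leave a mismatch by exactly a factor of \$f_s\$.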

Here we see the essential reason behind the Nyquist criterion: it is exactly what is required to guarantee that the terms of the sum don't overlap. With the Nyquist criterion, the above sum reduces to the periodic extension of the spectrum from the interval \$[-f_s/2, f_s/2]\$ to the whole real line.

Since the DTFT in \eqref{discreteft} agrees with the Fourier transform of our original signal on the interval \$[-f_s/2, f_s/2]\$, we can simply multiply it by the rectangular function \$\mathrm{rect}(f/f_s)\$ and get back the original signal. Via the convolution theorem, this amounts to convolving the Dirac comb with the inverse Fourier transform of the rectangular function, which in our conventions is $$ \mathcal{F}^{-1}(\mathrm{rect}(f/f_s)) = \frac{1}{T} \mathrm{sinc}(t/T), $$ where the normalized sinc function is $$ \mathrm{sinc}(x) \triangleq \frac{\sin(\pi x)}{\pi x}. $$ The convolution then simply replaces each Dirac delta in the Dirac comb with a sinc function shifted to the position of the delta, giving $$ x(t) = \sum_{n = -\infty}^\infty x[n] \mathrm{sinc}(t/T - n). \tag{2} \label{sumofsinc} $$ This is the Whittaker-Shannon interpolation formula.
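The interpolation formula can be illustrated numerically. The tone frequency, sample rate, and (necessarily finite) summation window below are assumptions of the sketch:

```python
import numpy as np

# Numeric check of the Whittaker-Shannon interpolation formula: reconstruct
# a bandlimited tone between its sample instants from a finite sinc sum.
fs = 100.0                 # Hz (assumed)
T = 1.0 / fs
f0 = 13.0                  # Hz, well below fs/2

n = np.arange(-500, 501)                   # finite window of samples
x_n = np.sin(2 * np.pi * f0 * n * T)       # x[n] = x(nT)

def reconstruct(t):
    # np.sinc is the normalized sinc, sin(pi x)/(pi x)
    return np.sum(x_n * np.sinc(t / T - n))

t_test = 0.00314                           # between sample instants
err = abs(reconstruct(t_test) - np.sin(2 * np.pi * f0 * t_test))
```

The residual error comes only from truncating the infinite sum; it shrinks as the window grows, reflecting the slow \$1/t\$ decay of the sinc.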

Non-ideal sampling

For translating the above theory to the real world, the most difficult part is guaranteeing the bandlimiting, which must be done before sampling. For the purposes of this answer, we assume this has been done. The remaining task is then to take samples of the instantaneous values of the signal. Since a real ADC needs a finite amount of time to form its approximation to the sample, the usual implementation stores the value of the signal in a sample-and-hold circuit, from which the digital approximation is formed.

Even though this closely resembles a zero-order hold, it is a distinct process: the value obtained from the sample-and-hold is indeed exactly the instantaneous value of the signal, up to the approximation that the signal stays constant for the duration it takes to charge the capacitor holding the sample value. This is usually well achieved by real-world systems.

Therefore, we can say that a real world ADC, ignoring the problem of bandlimiting, is a very good approximation to the case of ideal sampling, and specifically the "staircase" coming from the sample-and-hold does not cause any error in the sampling by itself.

Non-ideal reconstruction

For reconstruction, the goal is to find an electronic circuit that accomplishes the sum-of-sincs appearing in \$\eqref{sumofsinc}\$. Since the sinc has an infinite extent in time, it is quite clear that this cannot be exactly realized. Further, forming such a sum of signals even to a reasonable approximation would require multiple sub-circuits, and quickly become very complex. Therefore, usually a much simpler approximation is used: at each sampling instant, a voltage corresponding to the sample value is output, and held constant until the next sampling instant (although see Delta-sigma modulation for an example of an alternative method). This is the zero-order hold, and corresponds to replacing the sinc we used above with the rectangle function \$1/T\mathrm{rect}(t/T - 1/2)\$. Evaluating the convolution $$ (1/T\mathrm{rect}(t/T - 1/2))*\left(\sum_{n = -\infty}^\infty T x[n] \delta(t - n T)\right), $$ using the defining property of the delta function, we see that this indeed results in the classic continuous-time staircase waveform. The factor of \$1/T\$ enters to cancel the \$T\$ introduced in \eqref{discreteft}. That such a factor is needed is also clear from the fact that the units of an impulse response are 1/time.
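The convolution can be verified with a discrete stand-in for the integral. The sample values, rate, and grid density here are assumptions of the sketch:

```python
import numpy as np

# Discrete stand-in for the ZOH convolution: the scaled impulse train
# T * sum x[n] delta(t - nT), convolved with the causal kernel
# (1/T) rect(t/T - 1/2), yields the staircase, constant at x[n] on [nT, (n+1)T).
fs = 10.0                               # Hz (assumed)
T = 1.0 / fs
oversample = 100                        # dense grid points per sample period
dt = T / oversample                     # dense-grid step

x_n = np.array([0.0, 1.0, 0.5, -0.3, 0.8])   # arbitrary sample values

# Impulse train on the dense grid: a delta becomes height 1/dt at one point
train = np.zeros(len(x_n) * oversample)
train[::oversample] = T * x_n / dt

kernel = np.full(oversample, 1.0 / T)   # (1/T) rect(t/T - 1/2), for t in [0, T)

# Discrete convolution times dt approximates the convolution integral
zoh = np.convolve(train, kernel)[:len(train)] * dt
```

Note how the factors cancel exactly as in the text: the \$T\$ from the impulse train against the \$1/T\$ of the kernel, so each staircase step sits at \$x[n]\$ with no stray gain.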

The shift by \$T/2\$ is simply to guarantee causality. It only amounts to a delay of the output by half a sample relative to using \$\frac{1}{T} \mathrm{rect}(t/T)\$ (which may have consequences in real-time systems or when very precise synchronization to external events is needed), which we will ignore in what follows.

Comparing back to \eqref{discreteft}, we have replaced the rectangular function in the frequency domain, which left the baseband entirely untouched and removed all of the higher frequency copies of the spectrum, called images, with the Fourier transform of the function \$1/T \mathrm{rect}(t/T)\$. This is of course $$ \mathrm{sinc}(f/f_s). $$

Note that the logic is somewhat inverted from the ideal case: there we defined our goal, which was to remove the images, in the frequency domain, and derived the consequences in time domain. Here we defined how to reconstruct in the time domain (since that is what we know how to do), and derived the consequences in the frequency domain.

So the result of the zero-order hold is that instead of the rectangular windowing in the frequency domain, we end up with the sinc as a windowing function. Therefore:

  • The frequency response is no longer bandlimited. Rather, it decays as \$1/f\$, with the upper frequencies being images of the original signal.
  • In the baseband, the response already droops considerably, reaching about \$-3.9\$ dB at \$f_s/2\$.
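Both bullet points can be made quantitative; the figures below follow directly from the normalized sinc, with illustrative frequencies chosen for the sketch:

```python
import numpy as np

# Gain of the ZOH frequency response sinc(f/fs), in dB.
def zoh_gain_db(f_over_fs):
    return 20 * np.log10(abs(np.sinc(f_over_fs)))

droop_at_nyquist = zoh_gain_db(0.5)   # 20*log10(2/pi), about -3.9 dB

# A tone at 0.4 fs has its first image at 0.6 fs; the ZOH attenuates the
# image relative to the tone by only a few dB:
tone_db = zoh_gain_db(0.4)
image_db = zoh_gain_db(0.6)
```

The weak image attenuation near \$f_s/2\$ is why a post-DAC filter is still needed in practice.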

Overall, the zero-order hold is used to approximate the time-domain sinc function appearing in the Whittaker-Shannon interpolation formula. When sampling, the similar-looking sample-and-hold is a technical solution to the problem of estimating the instantaneous value of the signal, and does not produce any errors in itself.

Note that no information is lost in the reconstruction either, since we can always filter out the high frequency images after the initial zero-order hold. The gain loss can also be compensated by an inverse sinc filter, either before or after the DAC. So from a more practical point of view, the zero-order hold is used to construct an initial implementable approximation to the ideal reconstruction, which can then be further improved, if necessary.
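A frequency-domain sketch of that compensation (idealized: a practical compensator would be a short digital filter approximating \$1/\mathrm{sinc}\$ over the baseband only, not this exact division):

```python
import numpy as np

# Idealized inverse-sinc compensation: dividing the baseband gain by
# sinc(f/fs) restores a flat response, up to the images that a post-DAC
# filter must still remove.
f_over_fs = np.linspace(0.0, 0.45, 10)        # baseband frequencies
zoh_gain = np.sinc(f_over_fs)                 # droops toward fs/2
compensated = zoh_gain / np.sinc(f_over_fs)   # flat again

# Peak boost the compensator must supply near the band edge:
boost_at_edge_db = 20 * np.log10(1.0 / np.sinc(0.45))
```

The required boost stays modest (about 3 dB at \$0.45\,f_s\$), which is why inverse-sinc filters are cheap to realize on either side of the DAC.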

Timo
  • it's interesting Timo. you are running into a consequence of Wikipedia politics. check out [this older version of the Wikipedia article on the sampling theorem](https://en.wikipedia.org/w/index.php?title=Nyquist%E2%80%93Shannon_sampling_theorem&oldid=185170430). rather than hiding behind the Poisson summation formula, it just shows how sampling generates the images and explicitly what is required to recover the original continuous-time signal. and you can see **why** there is that \$T\$ factor in the sampling function. – robert bristow-johnson Dec 23 '15 at 17:47
  • It's interesting that the old version of the Wikipedia article is actually clearer, also in my opinion. The calculation is almost exactly what I write above, except that it gives a bit more details. – Timo Dec 23 '15 at 18:18
  • Anyway, I'm not quite sure why this is needed to understand why the factor of \$T\$ is needed: I think that what I write in the answer is a sufficient condition for the \$T\$ to be necessary (technically, a consistency condition, but we already assume that reconstruction is possible). Now, of course, understanding is always a subjective thing. For example, here it could be considered a deeper reason for the appearance of the factor \$T\$ that \$T\$ essentially becomes the integration measure \$\mathrm{d}t\$ when one takes the limit \$T \rightarrow 0\$. – Timo Dec 23 '15 at 18:23
  • I suppose what you're referring to as the why is the appearance of 1/T in the representation of the Dirac comb as a sum of complex exponentials, in https://en.wikipedia.org/w/index.php?title=Nyquist%E2%80%93Shannon_sampling_theorem&oldid=185170430#Mathematical_basis_for_the_theorem? Which is of course one way to put it, and quite directly related to the role of \$T\$ as a measure. – Timo Dec 23 '15 at 18:33
  • ..and above I should of course have written "...role of \$\delta\$ as a measure". See https://en.wikipedia.org/wiki/Dirac_delta_function#As_a_measure , and note that if we were to use this interpretation to build a Dirac comb to approximate the continuum, then units of \$\mathrm{d}t\$ have to match those of \$\delta\{\mathrm{d}t\}\$, which means we will anyway have to include some factor with units of time. \$T\$ is the only such factor that the limit of a Dirac comb with a decreasing interval between the "teeth" becomes the continuum measure \$\mathrm{d}t\$ on continuous functions. – Timo Dec 23 '15 at 18:38
  • the reason why the \$T\$ is necessary in the sampling function is so that the brick-wall reconstruction filter has a gain of 1 (and **dimensionless**) in the passband. most textbooks put a gain of \$T\$ in the passband of the reconstruction LPF and then we get students coming here (or to the USENET newsgroup *comp.dsp*) asking how to design a LPF with a passband gain of \$T\$. – robert bristow-johnson Dec 23 '15 at 19:07
  • This is of course true. From another point of view, it's a consequence of what I wrote above, since it ultimately stems from requiring *consistency* in the sense that the baseband Fourier transform of the continuous signal is the same as the discrete time Fourier transform of the samples x[n]. Overall, this means that there will be one factor of \$T\$ in one of these, where ever you like to put it. This is clear also from the fact that \$\mathrm{d}t\$ has units of time, whereas the sum in the DTFT is dimensionless. Then also the continuous time brickwall filter will have passband gain of 1. – Timo Dec 23 '15 at 20:02
  • well, it's not where ever we like to put it, if we want to settle clearly what the **net** effect of the ZOH is. it is **not** $$ H(s) = \frac{1 - e^{-sT}}{s} $$ and remember that the dimension of the dirac delta is in units of 1/time, and for a filter that has the same species of animal coming out (like voltage) as going in, then the transfer function must be dimensionless and the impulse response \$h(t)\$ must have the same dimension as the impulse, 1/time. – robert bristow-johnson Dec 23 '15 at 20:09
  • Of course, by where ever I meant in any of the three, the FT, DTFT or definition of x[n]. With any of these, the FT and DTFT have the same units, which is required so that there's any sense in comparing them in the first place. – Timo Dec 23 '15 at 22:24
  • no, i don't think the FT and DTFT have the same units. we compare them by the use of scaling factors. the FT has dimension of time (because of that \$dt\$ as you noted). and the frequency variable (whether it's \$f\$ or \$\Omega\$) has dimension 1/time. the frequency variable for the DTFT, \$\omega\$, is dimensionless. – robert bristow-johnson Dec 23 '15 at 22:46
  • actually the FT of \$x(t)\$ has dimension of time multiplied by the dimension of \$x(t)\$. the impulse response, \$h(t)\$, of something with the same species of animal output as is input has dimension 1/time. so the FT of \$h(t)\$ is dimensionless, as a frequency response should be. – robert bristow-johnson Dec 23 '15 at 22:56
  • Well, that's a matter of definition. When the DTFT is defined directly for a discrete time system, you indeed naturally get a dimensionless frequency (cycles/sample). However, when the purpose is to compare a continuous time and a discrete time system as here, it's natural to use nT as the time variable, still giving a dimensionful frequency (this convention is used, for example, here: https://en.wikipedia.org/wiki/Discrete-time_Fourier_transform). Then including the factor of \$T\$ also in the definition, you get the same units, dimension of \$x\$ times dimension of time, so for example V/Hz – Timo Dec 24 '15 at 07:47
  • it **is** a matter of definition. DTFT: $$ X_\text{DTFT}(\omega) \triangleq \sum\limits_{n=- \infty}^{+\infty} x[n] e^{-j \omega n} $$ FT: $$ X_\text{FT}(f) \triangleq \int\limits_{-\infty}^{+\infty} x(t) e^{-j 2 \pi f t} \ dt $$ and $$ x[n] \triangleq x(nT) $$ so, i think then $$ X_\text{DTFT}(\omega) = \sum\limits_{k=- \infty}^{+\infty} \frac{1}{T}X_\text{FT}\left( f-\frac{k}{T} \right) \bigg|_{f=\frac{ \omega}{2 \pi T}} $$ – robert bristow-johnson Dec 24 '15 at 08:00
  • , as I say in my first answer, btw. Anyway, the scaling factor you mention is of course exactly \$T\$, which boils down to the same end result. You get it from requiring that the reconstruction filter has dimensionless DC gain 1, I'm just saying that it also comes naturally from simply requiring same units for the FT and DTFT, and that they're the same in the baseband. One more way to get the result: estimate the integral in the FT by a Riemann sum, drop the limit, and you get exactly the DTFT with dimensionful frequency (\$2 \pi f n T\$ in the exp) and the necessary factor of \$T\$ in front. – Timo Dec 24 '15 at 08:01
  • oh, if you insist that the unit time is defined to be the sampling period, \$T\$, then that is the same as saying \$T=1\$. then i agree. but if we measure time in physical units like "seconds" or "microseconds", and we measure frequency in "Hz" or "kHz", then \$T\$ is not 1 because \$T\$ is not dimensionless, and neither are either \$t\$ nor \$dt\$ dimensionless. – robert bristow-johnson Dec 24 '15 at 08:04
  • what is the dimension of "\$2 \pi f n T \$"? – robert bristow-johnson Dec 24 '15 at 08:07
  • Uh, no. I'm specifically insisting that \$T\$ is the dimensionful sampling period, in seconds for example. Then, if I want to keep track of time also in the discrete domain, everything else will be dimensionless (such as \$n\$), and factors of \$T\$ keep track of dimensionful real world units of time. – Timo Dec 24 '15 at 08:08
  • \$2 \pi f n T\$ is dimensionless, since \$f\$ is 1/time, and \$T\$ has units of time. So the dimensionless frequency is the combination \$f T\$, which is indeed a fraction of the sample rate. – Timo Dec 24 '15 at 08:10
  • then, if you insist that (which i also insist), then \$X_\text{DTFT}(\omega)\$ has the same dimension as \$x[n]\$ which carries the same dimension as \$x(t)\$. and \$\omega\$ is dimensionless. but \$X_\text{FT}(f)\$ has the dimension of \$x(t)\$ multiplied with "time" (as a dimension) and the argument \$f\$ has dimension of 1/time. – robert bristow-johnson Dec 24 '15 at 08:13
  • of course, inside the DSP or computer that is doing the discrete-time portion of the processing of the hybrid system, there \$x[n]\$ is dimensionless because \$x(t)\$ is relative to the \$V_\text{REF}\$ of the ADC and the DAC, but i didn't want this problem to be messing around with that. i am assuming the ADC and DAC convert the same kind of quantity (probably voltage) and that the \$V_\text{REF}\$ for both are the same. so i just carry the dimension and units of \$x(t)\$ along for the ride. – robert bristow-johnson Dec 24 '15 at 08:16
  • Okay, this is a matter of convention, right? If you want to keep the DTFT dimensionless, then for comparing the FT and DTFT, you have the \$1/T\$ in front when you express the DTFT in terms of the FT, as you write above. This is perfectly fine. The other option is to define the DTFT with the factor of \$T\$ in front: $$ X_{DTFT} \triangleq \sum_{n=-\infty}^\infty T x[n] e^{-i 2\pi f n T} $$ , and the third is to define \$x[n] \triangleq T x(nT)\$. These are the three options I mention above, and with any of those you get consistent units, DC gain 1, and an impulse response with units 1/time. – Timo Dec 24 '15 at 08:19
  • i know of **no** convention that defines the DTFT differently than i have shown. in fact, the whole idea of defining discrete-time signals as \$x[n]\$ and the DTFT solely in terms of \$x[n]\$ is to *get rid* of the time dimension. the discrete-time signal, the \$\mathcal{Z}\$-transform, the DTFT, the DFT are **all** agnostic about sampling frequency, sampling period, aliasing, physical reality, whatever. they are **solely** mathematical constructs. – robert bristow-johnson Dec 24 '15 at 08:23
  • I think the third option is the weirdest of these, and that the dimensionful DTFT is the most natural, since there it is explicit that the discrete time system represents a real world system. However, I have no problem with your way to put it either, where the DTFT lives in an abstract discrete time domain, and the factors of \$T\$ are derived in order to compare with the FT when necessary. I'm just saying that there are several valid ways to think of this. – Timo Dec 24 '15 at 08:28
  • https://en.wikipedia.org/wiki/Discrete-time_Fourier_transform "alternative definition", (eq. 3). Note the definition \$x[n] \triangleq T x(nT)\$ a few lines above, canceling the \$1/T\$ from the Dirac delta, and so the units of the alternative definition are the same as those of the FT. – Timo Dec 24 '15 at 08:32
  • anything is valid if we define it to be valid. i am not *"agreeable"* to defining the DTFT or Z-transform or DFT differently than they are in the textbooks. (sometimes we might scale the DFT differently with a \$\frac{1}{\sqrt{N}}\$ to make it and its inverse self-similar, but i normally don't want to do that.) the one thing i **do** want to do differently than the textbooks is i put the \$T\$ factor in the sampling function, so that the reconstruction filter has no gain factor (it just eliminates images) and so that the scaling of the ZOH makes sense. the textbooks fail in this. – robert bristow-johnson Dec 24 '15 at 08:33
  • Well, I think we agree on the main point: in order to compare the FT and DTFT, and have consistent units throughout the sampling chain, there is a factor of \$T\$. In addition, it's certainly not in the reconstruction filter, since then your output that the filter acts upon would not have the same units as \$x(t)\$, which is clearly wrong. – Timo Dec 24 '15 at 08:36
  • the "alternative definition" and the like is how a particular editor (named "Bob K") has been crapping up Wikipedia for years. this is what i was alluding to regarding Wikipedia politics. sometimes Wikipedia articles get worse and worse in time. they don't always improve in time. i was trying to hold the line on that several years ago and have since sorta given up. (or i am editing now with different, pared back, goals of maintaining accuracy and pedagogy.) – robert bristow-johnson Dec 24 '15 at 08:38
  • *"In addition, it's certainly not in the reconstruction filter, since then your output that the filter acts upon would not have the same units as \$x(t)\$, which is clearly wrong."* $$ $$ we agree with each other, but the textbooks **do not**. every textbook (with one exception, Pohlmann, and it's because i influenced the author) puts the \$T\$ factor as the passband gain in the reconstruction LPF. that leads to two mistakes with students groping to understand this stuff. one is "how can i make an analog reconstruction filter with passband gain of 0.022675 ms? and the other is the ZOH. – robert bristow-johnson Dec 24 '15 at 08:42
  • So anyway, your goal with this question was then to get an answer which demonstrates clearly why the reconstruction filter has gain 1, in a reasonably pedagogical way? I think the main point is then to just make sure you're consistent with the units, and justify that in a way which makes intuitive sense. I do think that the DTFT with \$T\$ in front is quite intuitive, but having the \$T\$ in the sampling function is not bad either. That actually comes pretty naturally from the Riemann sum comparison between the FT and DTFT. – Timo Dec 24 '15 at 08:43
  • *"So anyway, your goal with this question was then to get an answer which demonstrates clearly why the reconstruction filter has gain 1, in a reasonably pedagogical way?"* that's a byproduct. the goal will be demonstrated when the bounty period is over. – robert bristow-johnson Dec 24 '15 at 08:46
  • Ok. I edited the answer to use the standard definition of the DTFT, since I do agree that it's better to stick to conventions, when that is possible. Also added a further remark to the end. Anyway, looking forward to hearing what further insight you have once the bounty period is over :) – Timo Dec 24 '15 at 10:57
  • if you wanna fix it a little more, your reconstruction filter is not quite "implementable", since it is non-causal. not a big deal, but should be corrected. – robert bristow-johnson Dec 24 '15 at 18:32
  • I can't help but think you should just add the answer you're after. Comments are not for extended discussion. – David Dec 25 '15 at 15:56
  • so @David, i did exactly that. i'm happy to award Timo with the bounty. – robert bristow-johnson Dec 27 '15 at 02:31

The zero-order hold has the role of approximating the delta and \$\mathrm{sinc}\$ functions appearing in the sampling theorem, whichever is appropriate.

For the purposes of clarity, I consider an ADC/DAC system with a voltage signal. All of the following applies to any sampling system with the appropriate change of units, though. I also assume that the input signal has already been magically bandlimited to fulfill the Nyquist criterion.

Start from sampling: ideally, one would sample the value of the input signal at a single instant. Since real ADCs need a finite amount of time to form their approximation, the instantaneous voltage is approximated by the sample-and-hold (instantaneous being approximated by the switching time used to charge the capacitor). So in essence, the hold converts the problem of applying a delta functional to the signal into the problem of measuring a constant voltage.

Note here that the difference between the input signal being multiplied by an impulse train or a zero-order hold being applied at the same instants is merely a question of interpretation, since the ADC will nonetheless store only the instantaneous voltages being held. One can be reconstructed from the other. For the purposes of this answer, I will adopt the interpretation that the sampled signal is the continuous-time signal of the form $$x(t) = \frac{\Delta t V_\mathrm{ref}}{2^n}\sum_k x_k \delta(t - k \Delta t),$$ where \$V_\mathrm{ref}\$ is the reference voltage of the ADC/DAC, \$n\$ is the number of bits, \$x_k\$ are the samples represented in the usual way as integers, and \$\Delta t\$ is the sampling period. This somewhat unconventional interpretation has the advantage that I am considering, at all times, a continuous-time signal, and sampling here simply means representing it in terms of the numbers \$x_k\$, which are indeed the samples in the usual sense.

In this interpretation, the spectrum of the signal in the baseband is the exact same as that of the original signal, and the effective convolution by the impulse train has the effect of replicating that signal such as to make the spectrum periodic. The replicas are called images of the spectrum. That the normalization factor \$\Delta t\$ is necessary can be seen, for example, by considering the DC-offset of a 1 volt pulse of duration \$\Delta t\$: its DC-offset defined as the \$f = 0\$ -component of the Fourier transform is $$ \hat{x}(0) = \int_0^{\Delta t} 1\mathrm{V}\mathrm{d}t = 1\mathrm{V} \Delta t.$$ In order to get the same result from our sampled version, we must indeed include the factor of \$\Delta t\$.*
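The same normalization argument checked numerically (the pulse width below is an assumed value):

```python
import numpy as np

# The f = 0 Fourier component of a 1 V pulse of width Delta-t is 1 V * Delta-t;
# the sampled representation reproduces it only via the Delta-t factor.
dt = 0.002                                     # sampling period, seconds (assumed)
m = 10000                                      # integration grid points
grid = dt / m
pulse_dc = np.sum(np.full(m, 1.0)) * grid      # Riemann sum of 1 V over the pulse
sampled_dc = dt * 1.0                          # Delta-t times the sample value
err = abs(pulse_dc - sampled_dc)
```

Without the \$\Delta t\$ weight, the sampled version would overstate the DC component by a factor of \$1/\Delta t\$.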

Ideal reconstruction then means constructing an electrical signal that has the same baseband spectrum as this signal, and no components at frequencies outside this range. This is the same as convolving the impulse train with the appropriate \$\mathrm{sinc}\$-function. This is quite challenging to do electronically, so the \$\mathrm{sinc}\$ is often approximated by a rectangular function, AKA zero-order hold. In essence, at each delta function, the value of the sample is held for the duration of the sampling period.

In order to see what consequences this has for the reconstructed signal, I observe that the hold is exactly equivalent to convolving the impulse train with the rectangular function $$\mathrm{rect}_{\Delta t}(t) = \frac{1}{\Delta t}\mathrm{rect}\left(\frac{t}{\Delta t}\right).$$ The normalization of this rectangular function is defined by requiring that a constant voltage is correctly reproduced, or in other words, if a voltage \$V_1\$ was measured when sampling, the same voltage is output on reconstruction.

In the frequency domain, this amounts to multiplying the spectrum by the Fourier transform of the rectangular function, which is $$ \hat{\mathrm{rect}}_{\Delta t}(f) = \mathrm{sinc}(\Delta t f) = \frac{\sin(\pi \Delta t f)}{\pi \Delta t f}. $$ Note that the gain at DC is \$1\$. At high frequencies, the \$\mathrm{sinc}\$ decays like \$1/f\$, and therefore attenuates the images of the spectrum.

In the end, the \$\mathrm{sinc}\$-function resulting from the zero-order hold behaves as a low-pass filter on the signal. Note that no information is lost in the sampling phase (assuming the Nyquist criterion), and in principle, neither is anything lost when reconstructing: the filtering in the baseband by the \$\mathrm{sinc}\$ could be compensated for by an inverse filter (and this is indeed sometimes done, see for example https://www.maximintegrated.com/en/app-notes/index.mvp/id/3853). The modest \$-6\mathrm{dB/octave}\$ decay of the \$\mathrm{sinc}\$ usually requires some form of filtering to further attenuate the images.
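The \$-6\,\mathrm{dB/octave}\$ figure follows from the \$1/f\$ envelope of the sinc; a short check (the frequencies, chosen away from the sinc nulls, are assumptions of the sketch):

```python
import numpy as np

# The high-frequency envelope of |sinc(dt * f)| is 1/(pi * dt * f); an octave
# step in f therefore costs 20*log10(1/2), about -6.02 dB.
dt = 1.0                                   # normalized sampling period
envelope = lambda f: 1.0 / (np.pi * dt * f)
decay_per_octave_db = 20 * np.log10(envelope(20.5) / envelope(10.25))
```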

Note also that an imaginary impulse generator that could physically reproduce the impulse train used in the analysis would output an infinite amount of energy in reconstructing the images. This would also cause some hairy effects, such as that an ADC re-sampling the output would see nothing, unless it were perfectly synchronized to the original system (it would mostly sample between the impulses). This shows clearly that even if we cannot bandlimit the output exactly, some approximate bandlimiting is always needed to regularize the total energy of the signal, before it can be converted to a physical representation.

To summarize:

  • In both directions, the zero-order hold acts as an approximation to a delta function, or its band-limited form, the \$\mathrm{sinc}\$ function.
  • From the frequency domain point of view, it is an approximation to the brickwall filter that removes images, and therefore regularizes the infinite amount of energy present in the idealized impulse train.

*This is also clear from dimensional analysis: the units of a Fourier transform of a voltage signal are \$\mathrm{V}\mathrm{s} = \frac{\mathrm{V}}{\mathrm{Hz}}, \$ whereas the delta function has units of \$1/\mathrm{s}\$, which would cancel the unit of time coming from the integral in the transform.

Timo
  • when the timer allows me to, i will put a bounty on this, Timo. there are some things that i like: e.g. having the DC gain = 1, which is consistent with Eq. 1 on your maxim citation, but **way** too many textbooks screw it up with a gain of \$T\$ that they don't know what to do with. and it appears that you are understanding that the ZOH has nothing to do with any possible S/H at the input of the ADC. that's good. i'll still wait for a little more rigorous answer. and don't worry about \$V_\text{ref}\$. i am assuming it's the same for the ADC and DAC. – robert bristow-johnson Dec 18 '15 at 19:25
  • @robertbristow-johnson: thanks for the kind words! Can you specify a little in what direction are you looking for more rigor? More details, more maths proof style answer, or something completely different? – Timo Dec 18 '15 at 21:42
  • i guess a mathematical treatment with clean and consistent mathematical notation. i would suggest being consistent with Oppenheim and Wilsky or something like that. $$T \triangleq \frac{1}{f_\text{s}}$$ $$ x[n] \triangleq x(nT) $$ perhaps, so that the Laplace and Fourier transforms have consistent and compatible notation $$ \mathcal{F}\{x(t)\} = X(j2\pi f) \triangleq \int\limits_{-\infty}^{+\infty} x(t) e^{-j2\pi ft} \ dt $$ . discuss what the sampling theorem is saying and how it is different in reality and where the ZOH comes in on that. – robert bristow-johnson Dec 18 '15 at 21:59
  • Ok, let me actually try writing another answer, since editing this to change the notation to what you prefer etc would probably leave a bit of mess. I'll just fix a small mistake from this one first, since it bothers me... – Timo Dec 23 '15 at 10:09
  • i was a little confused and slow-on-the-draw and didn't hit the bounty icon to award your bounty. according to the rules: *If you do not award your bounty within 7 days (plus the grace period), the highest voted answer created after the bounty started with a minimum score of 2 will be awarded half the bounty amount. If two or more eligible answers have the same score (i.e., their scores are tied), the oldest answer is awarded the bounty. If there's no answer meeting those criteria, the bounty is not awarded to anyone.* -- according to these rules you should get it within a week. – robert bristow-johnson Dec 28 '15 at 01:57
  • Yes, I noticed you asked about that in the meta section. Thanks for taking that up on my behalf, and thanks for the bounty! Also, the very next paragraph in the meta: *If the bounty was started by the question owner, and the question owner accepts an answer posted during the bounty period, and the bounty expires without an explicit award then we assume the bounty owner liked the answer they accepted and award it the full bounty amount at the time of bounty expiration.* --but I guess the 7 days + grace affects this too, so I suppose I'll get it next week, as you say. – Timo Dec 28 '15 at 02:53

Fourier Transform: $$ X(j 2 \pi f) = \mathscr{F}\Big\{x(t)\Big\} \triangleq \int\limits_{-\infty}^{+\infty} x(t) \ e^{-j 2 \pi f t} \ \text{d}t $$

Inverse Fourier Transform: $$ x(t) = \mathscr{F}^{-1}\Big\{X(j 2 \pi f)\Big\} = \int\limits_{-\infty}^{+\infty} X(j 2 \pi f) \ e^{j 2 \pi f t} \ \text{d}f $$

Rectangular pulse function: $$ \operatorname{rect}(u) \triangleq \begin{cases} 0 & \mbox{if } |u| > \frac{1}{2} \\ 1 & \mbox{if } |u| < \frac{1}{2} \\ \end{cases} $$

"Sinc" function ("sinus cardinalis"): $$ \operatorname{sinc}(v) \triangleq \begin{cases} 1 & \mbox{if } v = 0 \\ \frac{\sin(\pi v)}{\pi v} & \mbox{if } v \ne 0 \\ \end{cases} $$

Define sampling frequency, \$ f_\text{s} \triangleq \frac{1}{T} \$ as the reciprocal of the sampling period \$T\$.

Note that: $$ \mathscr{F}\Big\{\operatorname{rect}\left( \tfrac{t}{T} \right) \Big\} = T \ \operatorname{sinc}(fT) = \frac{1}{f_\text{s}} \ \operatorname{sinc}\left( \frac{f}{f_\text{s}} \right)$$
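This transform pair is easy to spot-check numerically. The sketch below (assuming NumPy, whose `np.sinc` uses the same normalized \$\sin(\pi v)/(\pi v)\$ convention defined above; the value of \$T\$ and the test frequencies are arbitrary choices) approximates the Fourier integral of \$\operatorname{rect}(t/T)\$ with a midpoint Riemann sum:

```python
import numpy as np

# Check F{rect(t/T)}(f) = T*sinc(f*T) numerically with a midpoint Riemann sum.
# np.sinc is the normalized sinc: sin(pi*x)/(pi*x).
T = 0.5                                  # arbitrary pulse width / sampling period
N = 200000
dt = T / N
t = (np.arange(N) + 0.5) * dt - T / 2    # midpoints covering the support of rect(t/T)
for f in [0.0, 0.3, 1.0, 2.5]:
    X = np.sum(np.exp(-2j * np.pi * f * t)) * dt   # Fourier integral over the pulse
    assert abs(X - T * np.sinc(f * T)) < 1e-6
```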

Dirac comb (a.k.a. "sampling function" a.k.a. "Sha function"):

$$ \operatorname{III}_T(t) \triangleq \sum\limits_{n=-\infty}^{+\infty} \delta(t - nT) $$

Dirac comb is periodic with period \$T\$. Fourier series:

$$ \operatorname{III}_T(t) = \sum\limits_{k=-\infty}^{+\infty} \frac{1}{T} e^{j 2 \pi k f_\text{s} t} $$

Sampled continuous-time signal:

*(figure: ideally sampled signal with Dirac comb)*

$$ \begin{align} x_\text{s}(t) & = x(t) \cdot \left( T \cdot \operatorname{III}_T(t) \right) \\ & = x(t) \cdot \left( T \cdot \sum\limits_{n=-\infty}^{+\infty} \delta(t - nT) \right) \\ & = T \ \sum\limits_{n=-\infty}^{+\infty} x(t) \ \delta(t - nT) \\ & = T \ \sum\limits_{n=-\infty}^{+\infty} x(nT) \ \delta(t - nT) \\ & = T \ \sum\limits_{n=-\infty}^{+\infty} x[n] \ \delta(t - nT) \\ \end{align} $$

where \$ x[n] \triangleq x(nT) \$.

This means that \$x_\text{s}(t)\$ is defined solely by the samples \$x[n]\$ and the sampling period \$T\$, and totally loses any information about the values of \$x(t)\$ for times in between sampling instances. \$x[n]\$ is a discrete sequence of numbers and is a sort of DSP shorthand notation for \$x_n\$. While it is true that \$x_\text{s}(t) = 0\$ for \$ nT < t < (n+1)T \$, the value of \$x[n]\$ for any \$n\$ not an integer is undefined.

N.B.: The discrete signal \$x[n]\$ and all discrete-time operations on it, like the \$\mathcal{Z}\$-Transform, the Discrete-Time Fourier Transform (DTFT), the Discrete Fourier Transform (DFT), are "agnostic" regarding the sampling frequency or the sampling period \$T\$. Once you're in the discrete-time \$x[n]\$ domain, you do not know (or care) about \$T\$. It is only with the Nyquist-Shannon Sampling and Reconstruction Theorem that \$x[n]\$ and \$T\$ are put together.

The Fourier Transform of \$x_\text{s}(t)\$ is

$$ \begin{align} X_\text{s}(j 2 \pi f) \triangleq \mathscr{F}\{ x_\text{s}(t) \} & = \mathscr{F}\left\{x(t) \cdot \left( T \cdot \operatorname{III}_T(t) \right) \right\} \\ & = \mathscr{F}\left\{x(t) \cdot \left( T \cdot \sum\limits_{k=-\infty}^{+\infty} \frac{1}{T} e^{j 2 \pi k f_\text{s} t} \right) \right\} \\ & = \mathscr{F}\left\{ \sum\limits_{k=-\infty}^{+\infty} x(t) \ e^{j 2 \pi k f_\text{s} t} \right\} \\ & = \sum\limits_{k=-\infty}^{+\infty} \mathscr{F}\left\{ x(t) \ e^{j 2 \pi k f_\text{s} t} \right\} \\ & = \sum\limits_{k=-\infty}^{+\infty} X\left(j 2 \pi (f - k f_\text{s})\right) \\ \end{align} $$

Important note about scaling: The sampling function \$ T \cdot \operatorname{III}_T(t) \$ and the sampled signal \$x_\text{s}(t)\$ have a factor of \$T\$ that you will not see in nearly all textbooks. That is a pedagogical mistake by the authors of these textbooks, for multiple (related) reasons:

  1. First, leaving out the \$T\$ changes the dimension of the sampled signal \$x_\text{s}(t)\$ from the dimension of the signal getting sampled \$x(t)\$.
  2. That \$T\$ factor will be needed somewhere in the signal chain. These textbooks that leave it out of the sampling function end up putting it into the reconstruction part of the Sampling Theorem, usually as the passband gain of the reconstruction filter. That is dimensionally confusing. Someone might reasonably ask: "How do I design a brickwall LPF with passband gain of \$T\$?"
  3. As will be seen below, leaving the \$T\$ out here results in a similar scaling error for the net transfer function and net frequency response of the Zero-order Hold (ZOH). All textbooks on digital (and hybrid) control systems that I have seen make this mistake and it is a serious pedagogical error.

Note that the DTFT of \$x[n]\$ and the Fourier Transform of the sampled signal \$x_\text{s}(t)\$ are, with proper scaling, virtually identical:

DTFT: $$ \begin{align} X_\mathsf{DTFT}(\omega) & \triangleq \mathcal{Z}\{x[n]\} \Bigg|_{z=e^{j\omega}} \\ & = X_\mathcal{Z}(e^{j\omega}) \\ & = \sum\limits_{n=-\infty}^{+\infty} x[n] \ e^{-j \omega n} \\ \end{align} $$

It can be shown that

$$ X_\mathsf{DTFT}(\omega) = X_\mathcal{Z}(e^{j\omega}) = \frac{1}{T} X_\text{s}(j 2 \pi f)\Bigg|_{f=\frac{\omega}{2 \pi T}} $$
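For a finite-length sequence, this scaling relation can be checked term by term (a NumPy sketch; the sample values, \$T\$, and the test frequencies are made up for illustration):

```python
import numpy as np

T = 0.001                                  # arbitrary sampling period
xn = np.array([1.0, -2.0, 0.5, 3.0])       # made-up samples x[0..3]
n = np.arange(len(xn))

for f in [50.0, 217.0, 400.0]:             # arbitrary test frequencies
    w = 2 * np.pi * f * T                  # omega = 2*pi*f*T
    X_dtft = np.sum(xn * np.exp(-1j * w * n))
    # X_s(j2*pi*f) = T * sum_n x[n] e^{-j 2 pi f n T}  (FT of the scaled impulse train)
    X_s = T * np.sum(xn * np.exp(-2j * np.pi * f * n * T))
    assert abs(X_dtft - X_s / T) < 1e-12   # X_DTFT(w) = (1/T) X_s at f = w/(2 pi T)
```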


The above math is true whether \$x(t)\$ is "properly sampled" or not. \$x(t)\$ is "properly sampled" if \$x(t)\$ can be fully recovered from the samples \$x[n]\$ and knowledge of the sampling rate or sampling period. The Sampling Theorem tells us what is necessary to recover or reconstruct \$x(t)\$ from \$x[n]\$ and \$T\$.

If \$x(t)\$ is bandlimited to some bandlimit \$B\$, that means

$$ X(j 2 \pi f) = 0 \quad \quad \text{for all} \quad |f| > B $$

*(figure: bandlimited spectrum)*

Consider the spectrum of the sampled signal made up of shifted images of the original:

$$ X_\text{s}(j 2 \pi f) = \sum\limits_{k=-\infty}^{+\infty} X\left(j 2 \pi (f - k f_\text{s})\right) $$

The original spectrum \$X(j 2 \pi f)\$ can be recovered from the sampled spectrum \$X_\text{s}(j 2 \pi f)\$ if none of the shifted images, \$X\left(j 2 \pi (f - k f_\text{s})\right)\$, overlap their adjacent neighbors. This means that the right edge of the \$k\$-th image (which is \$X\left(j 2 \pi (f - k f_\text{s})\right)\$) must be entirely to the left of the left edge of the (\$k+1\$)-th image (which is \$X\left(j 2 \pi (f - (k+1) f_\text{s})\right)\$). Restated mathematically,

$$ k f_\text{s} + B < (k+1) f_\text{s} - B $$

which is equivalent to

$$ f_\text{s} > 2B $$

If we sample at a rate that exceeds twice the bandwidth, none of the images overlap, and the original spectrum \$X(j 2 \pi f)\$, which is the image where \$k=0\$, can be extracted from \$X_\text{s}(j 2 \pi f)\$ with a brickwall low-pass filter that keeps the original image (where \$k=0\$) unscaled and discards all of the other images. That means it multiplies the original image by 1 and multiplies all of the other images by 0.

$$ \begin{align} X(j 2 \pi f) & = \operatorname{rect}\left( \frac{f}{f_\text{s}} \right) \cdot X_\text{s}(j 2 \pi f) \\ & = H(j 2 \pi f) \ X_\text{s}(j 2 \pi f) \\ \end{align} $$

*(figure: reconstruction filter)*

The reconstruction filter is

$$ H(j 2 \pi f) = \operatorname{rect}\left( \frac{f}{f_\text{s}} \right) $$

and has acausal impulse response:

$$ h(t) = \mathscr{F}^{-1} \{H(j 2 \pi f)\} = f_\text{s} \operatorname{sinc}(f_\text{s}t) $$

This filtering operation, expressed as multiplication in the frequency domain is equivalent to convolution in the time domain:

$$ \begin{align} x(t) & = h(t) \circledast x_\text{s}(t) \\ & = h(t) \circledast T \ \sum\limits_{n=-\infty}^{+\infty} x[n] \ \delta(t-nT) \\ & = T \ \sum\limits_{n=-\infty}^{+\infty} x[n] \ (h(t) \circledast \delta(t-nT) ) \\ & = T \ \sum\limits_{n=-\infty}^{+\infty} x[n] \ h(t-nT) \\ & = T \ \sum\limits_{n=-\infty}^{+\infty} x[n] \ \left(f_\text{s} \operatorname{sinc}(f_\text{s}(t-nT)) \right) \\ & = \sum\limits_{n=-\infty}^{+\infty} x[n] \ \operatorname{sinc}(f_\text{s}(t-nT)) \\ & = \sum\limits_{n=-\infty}^{+\infty} x[n] \ \operatorname{sinc}\left( \frac{t-nT}{T}\right) \\ \end{align} $$

That spells out explicitly how the original \$x(t)\$ is reconstructed from the samples \$x[n]\$ and knowledge of the sampling rate or sampling period.
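As a numeric sanity check (a NumPy sketch; the test signal, sampling rate, and truncation length are arbitrary choices, and truncating the infinite sum introduces a small error), the reconstruction sum recovers a bandlimited signal at off-grid times:

```python
import numpy as np

fs = 10.0                        # sampling rate, comfortably above 2*B
T = 1.0 / fs
# made-up bandlimited test signal, B = 2.5 Hz < fs/2
x = lambda t: np.cos(2 * np.pi * 1.0 * t) + 0.5 * np.sin(2 * np.pi * 2.5 * t)

n = np.arange(-2000, 2001)       # truncate the (infinite) sample sum
samples = x(n * T)               # x[n] = x(nT)

def reconstruct(t):
    # x(t) = sum_n x[n] * sinc((t - nT)/T)   (truncated)
    return np.sum(samples * np.sinc((t - n * T) / T))

for t in [0.123, -0.4567, 0.05]:     # off-grid times near the center
    assert abs(reconstruct(t) - x(t)) < 1e-2
```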


So what is output from a practical Digital-to-Analog Converter (DAC) is neither

$$ \sum\limits_{n=-\infty}^{+\infty} x[n] \ \operatorname{sinc}\left( \frac{t-nT}{T}\right) $$

which needs no additional treatment to recover \$x(t)\$, nor

$$ x_\text{s}(t) = \sum\limits_{n=-\infty}^{+\infty} x[n] \ T \delta(t-nT) $$

which, with an ideal brickwall LPF recovers \$x(t)\$ by isolating and retaining the baseband image and discarding all of the other images.

*(figure: DAC output)*

What comes out of a conventional DAC, if there is no processing or scaling done to the digitized signal, is the value \$x[n]\$ held at a constant value until the next sample is to be output. This results in a piecewise-constant function:

$$ x_\text{DAC}(t) = \sum\limits_{n=-\infty}^{+\infty} x[n] \ \operatorname{rect}\left(\frac{t-nT - \frac{T}{2}}{T} \right) $$

Note the delay of \$\frac{1}{2}\$ sample period applied to the \$\operatorname{rect}(\cdot)\$ function. This makes it causal. It means simply that

$$ x_\text{DAC}(t) = x[n] = x(nT) \quad \quad \text{when} \quad nT \le t < (n+1)T $$

Stated differently

$$ x_\text{DAC}(t) = x[n] = x(nT) \quad \quad \text{for} \quad n = \operatorname{floor}\left( \frac{t}{T} \right)$$

where \$\operatorname{floor}(u) = \lfloor u \rfloor\$ is the floor function, defined to be the largest integer not exceeding \$u\$.
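In code, this piecewise-constant rule is just an index computed with the floor function (a small NumPy sketch with made-up sample values and an arbitrary \$T\$):

```python
import numpy as np

fs = 4.0
T = 1.0 / fs
samples = np.array([0.0, 1.0, 0.5, -0.25])   # made-up x[0..3]

def x_dac(t):
    # hold x[n] constant over nT <= t < (n+1)T, with n = floor(t/T)
    n = int(np.floor(t / T))
    return samples[n]

assert x_dac(0.0)  == 0.0     # start of the first interval
assert x_dac(0.24) == 0.0     # still inside [0, T)
assert x_dac(0.25) == 1.0     # at the boundary, jumps to the next sample
assert x_dac(0.6)  == 0.5     # n = floor(0.6/0.25) = 2
```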

This DAC output is directly modeled as a linear time-invariant (LTI) system or filter that accepts the ideally sampled signal \$x_\text{s}(t)\$ and, for each impulse in the ideally sampled signal, outputs this impulse response:

$$ h_\text{ZOH}(t) = \frac{1}{T} \operatorname{rect}\left(\frac{t - \frac{T}{2}}{T} \right) $$

Plugging in to check this...

$$ \begin{align} x_\text{DAC}(t) & = h_\text{ZOH}(t) \circledast x_\text{s}(t) \\ & = h_\text{ZOH}(t) \circledast T \ \sum\limits_{n=-\infty}^{+\infty} x[n] \ \delta(t-nT) \\ & = T \ \sum\limits_{n=-\infty}^{+\infty} x[n] \ (h_\text{ZOH}(t) \circledast \delta(t-nT) ) \\ & = T \ \sum\limits_{n=-\infty}^{+\infty} x[n] \ h_\text{ZOH}(t-nT) \\ & = T \ \sum\limits_{n=-\infty}^{+\infty} x[n] \ \frac{1}{T} \operatorname{rect}\left(\frac{t - nT - \frac{T}{2}}{T} \right) \\ & = \sum\limits_{n=-\infty}^{+\infty} x[n] \ \operatorname{rect}\left(\frac{t - nT - \frac{T}{2}}{T} \right) \\ \end{align} $$

The DAC output \$x_\text{DAC}(t)\$, as the output of an LTI system with impulse response \$h_\text{ZOH}(t)\$, agrees with the piecewise-constant construction above. And the input to this LTI system is the sampled signal \$x_\text{s}(t)\$, judiciously scaled so that the baseband image of \$x_\text{s}(t)\$ is exactly the same as the spectrum of the original signal being sampled, \$x(t)\$. That is

$$ X(j 2 \pi f) = X_\text{s}(j 2 \pi f) \quad \quad \text{for} \quad -\frac{f_\text{s}}{2} < f < +\frac{f_\text{s}}{2} $$

The original signal spectrum is the same as the sampled spectrum, but with all images, that had appeared due to sampling, discarded.

The transfer function of this LTI system, which we call the Zero-order hold (ZOH), is the Laplace Transform of the impulse response:

$$ \begin{align} H_\text{ZOH}(s) & = \mathscr{L} \{ h_\text{ZOH}(t) \} \\ & \triangleq \int\limits_{-\infty}^{+\infty} h_\text{ZOH}(t) \ e^{-s t} \ \text{d}t \\ & = \int\limits_{-\infty}^{+\infty} \frac{1}{T} \operatorname{rect}\left(\frac{t - \frac{T}{2}}{T} \right) \ e^{-s t} \ \text{d}t \\ & = \int\limits_0^T \frac{1}{T} \ e^{-s t} \ \text{d}t \\ & = \frac{1}{T} \quad \frac{1}{-s}e^{-s t}\Bigg|_0^T \\ & = \frac{1-e^{-sT}}{sT} \\ \end{align}$$
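The closed form can be spot-checked by integrating \$h_\text{ZOH}(t)\$ numerically at a few points in the \$s\$-plane (a NumPy sketch; the values of \$T\$ and \$s\$ are arbitrary test points):

```python
import numpy as np

T = 0.125
for s in [0.5, 2.0, 1.0 + 3.0j]:        # arbitrary test points in the s-plane
    N = 100000
    dt = T / N
    t = (np.arange(N) + 0.5) * dt        # midpoints over [0, T], the support of h_ZOH
    integral = np.sum((1.0 / T) * np.exp(-s * t)) * dt   # L{h_ZOH}(s), numerically
    closed_form = (1 - np.exp(-s * T)) / (s * T)
    assert abs(integral - closed_form) < 1e-8
```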

The frequency response is obtained by substituting \$ j 2 \pi f \rightarrow s \$

$$ \begin{align} H_\text{ZOH}(j 2 \pi f) & = \frac{1-e^{-j2\pi fT}}{j2\pi fT} \\ & = e^{-j\pi fT} \frac{e^{j\pi fT}-e^{-j\pi fT}}{j2\pi fT} \\ & = e^{-j\pi fT} \frac{\sin(\pi fT)}{\pi fT} \\ & = e^{-j\pi fT} \operatorname{sinc}(fT) \\ & = e^{-j\pi fT} \operatorname{sinc}\left(\frac{f}{f_\text{s}}\right) \\ \end{align}$$

This indicates a linear-phase filter with a constant delay of one-half sample period, \$\frac{T}{2}\$, and a gain that decreases as frequency \$f\$ increases; a mild low-pass filter effect. At DC, \$f=0\$, the gain is 0 dB, and at Nyquist, \$f=\frac{f_\text{s}}{2}\$, the gain is \$-3.9224\$ dB. So the baseband image has some of its high-frequency components reduced a little.

As with the sampled signal \$x_\text{s}(t)\$, there are images in the DAC output \$x_\text{DAC}(t)\$ at integer multiples of the sampling frequency, but those images are significantly attenuated (relative to the baseband image) because \$|H_\text{ZOH}(j 2 \pi f)|\$ passes through zero at \$f = k\cdot f_\text{s}\$ for every nonzero integer \$k\$, which is right in the middle of each image.
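The numbers quoted above (unity DC gain, about \$-3.9224\$ dB at Nyquist, and nulls at nonzero integer multiples of \$f_\text{s}\$) can be verified directly from the frequency response (a NumPy sketch; the choice of \$f_\text{s}\$ is arbitrary):

```python
import numpy as np

fs = 48000.0
T = 1.0 / fs

def H_zoh(f):
    # H_ZOH(j 2 pi f) = exp(-j pi f T) * sinc(f T)
    return np.exp(-1j * np.pi * f * T) * np.sinc(f * T)

# DC gain is exactly 1 (0 dB)
assert abs(H_zoh(0.0)) == 1.0
# gain at Nyquist: |sinc(1/2)| = 2/pi  ->  about -3.9224 dB
db_nyquist = 20 * np.log10(abs(H_zoh(fs / 2)))
assert abs(db_nyquist - (-3.9224)) < 1e-3
# images are nulled at f = k*fs for nonzero integer k: sinc(k) = 0
assert abs(H_zoh(fs)) < 1e-12
```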

Concluding:

  1. The Zero-order hold (ZOH) is a linear time-invariant model of the signal reconstruction done by a practical Digital-to-Analog converter (DAC) that holds the output constant at the sample value, \$x[n]\$, until updated by the next sample \$x[n+1]\$.

  2. Contrary to a common misconception, the ZOH has nothing to do with the sample-and-hold circuit (S/H) one might find preceding an Analog-to-Digital converter (ADC). As long as the DAC holds the output to a constant value over each sampling period, it doesn't matter whether or not the ADC has an S/H; the ZOH effect remains. If the DAC outputs something other than the piecewise-constant output depicted above as \$x_\text{DAC}(t)\$ (such as a sequence of narrow pulses intended to approximate Dirac impulses), then the ZOH effect is not present (something else is, instead), whether there is an S/H circuit preceding the ADC or not.

  3. The net transfer function of the ZOH is $$ H_\text{ZOH}(s) = \frac{1-e^{-sT}}{sT} $$ and the net frequency response of the ZOH is $$ H_\text{ZOH}(j 2 \pi f) = e^{-j\pi fT} \operatorname{sinc}(fT) $$ Many textbooks leave out the \$T\$ factor in the denominator of the transfer function and that is a mistake.

  4. The ZOH reduces the images of the sampled signal \$x_\text{s}(t)\$ significantly, but does not eliminate them. To eliminate the images, one needs a good low-pass filter, as before. Brickwall LPFs are an idealization. A practical LPF may also attenuate the baseband image (that we want to keep) at high frequencies, and that attenuation must be accounted for, as with the attenuation that results from the ZOH (which is at most 3.9224 dB). The ZOH also delays the signal by one-half sample period, which may have to be taken into consideration (along with the delay of the anti-imaging LPF), particularly if the ZOH is in a feedback loop.

robert bristow-johnson
  • I'll admit your answer is cleaner and a bit more thorough than mine. I was still left wondering, what was the big reveal? Maybe that you wanted to emphasize the zero-order hold as a *model* of the DAC-output? – Timo Jan 06 '16 at 20:25
  • your answer has some mistakes. for instance, it doesn't show the 1/2 sample delay in the frequency response. sorry that the way things happened that *our* bounty (what was *mine* and should now be *yours*) went down the toilet. – robert bristow-johnson Jan 06 '16 at 22:48
  • Well, I do mention it (in the longer one), although I do then brush it under the carpet, which I think I did because of mostly thinking about DSP in terms of audio, where a 1/2 sample delay is insignificant (unless theres another path which introduces an undelayed copy). Basically I just didn't want to carry the factor of \$e^{-i \pi f T}\$ all the way to the end, so that's a part of what I'm saying you're more thorough. – Timo Jan 06 '16 at 22:56
  • @Timo, now you got twice the rep as me. when are you gonna post a bounty that i can take a stab at?? :-) – robert bristow-johnson Jan 25 '16 at 07:27
  • Fair enough, I should try to think of something :D – Timo Jan 25 '16 at 10:45
  • no sweatsky. i just was impressed to see you at 600. – robert bristow-johnson Jan 25 '16 at 18:10