Discrete-Time Signals#

Once an analog signal has been sampled, it becomes a sequence of numbers — discrete in time and (after quantization) discrete in amplitude. Working with these sequences requires a slightly different vocabulary and a few concepts that don’t exist in the continuous world: sample indexing, block processing, windowing, and the subtle errors that come from treating finite chunks of data as if they represent infinite signals.

Samples and Sequences#

A discrete-time signal x[n] is a sequence of values indexed by an integer n. Each value corresponds to a sample taken at time t = n/f_s, where f_s is the sample rate. The square-bracket notation x[n] (vs parentheses x(t) for continuous signals) is a deliberate convention that flags “this is discrete.”

Sequence operations are analogous to continuous-time operations:

Delay: x[n - k] shifts the sequence by k samples. One sample of delay at 48 kHz is about 20.8 µs
Scaling: a × x[n] multiplies every sample by a constant
Addition: x[n] + y[n] adds corresponding samples (same rate assumed)
Convolution: y[n] = Σ h[k] × x[n - k] — the fundamental operation of digital filtering

The sample rate is metadata — the sequence itself is just numbers. This means the same sequence of values could represent a 1 kHz signal at 48 kHz sample rate, or a 10 kHz signal at 480 kHz sample rate. All frequency information is relative to the sample rate.

Frames and Blocks#

Real-time digital signal processing doesn’t operate on individual samples one at a time (usually). Instead, samples are collected into blocks (frames) of N samples, and the DSP algorithm processes entire blocks:

Block size (N): Typical values range from 32 to 4096 samples. Larger blocks are more computationally efficient (setup overhead is amortized) and provide better frequency resolution for spectral analysis
Block latency: Processing can’t begin until a full block is collected, adding at minimum N/f_s seconds of latency. A 256-sample block at 48 kHz adds 5.3 ms minimum

The latency-efficiency tradeoff:

Block Size	Latency at 48 kHz	Use Case
32	0.67 ms	Real-time monitoring, low-latency effects
128	2.67 ms	Live audio processing
256	5.33 ms	General audio (near perception threshold)
1024	21.3 ms	Spectral analysis, non-real-time effects
4096	85.3 ms	High-resolution FFT, offline processing

For real-time audio, block sizes above ~512 samples at 48 kHz (>10 ms) start to become perceptible as latency. Musical performance monitoring needs ≤5 ms — see Latency & Throughput.

Windowing#

When analyzing a finite block of samples, the block is implicitly multiplied by a rectangular window — ones inside the block, zeros outside. This abrupt truncation creates artifacts in the frequency domain (spectral leakage): energy from a single frequency smears across many frequency bins.

Window functions taper the edges of the block to reduce leakage, trading off frequency resolution for reduced sidelobes:

Window	Main Lobe Width	Sidelobe Level	Use Case
Rectangular	Narrowest	-13 dB (worst)	When signals are exactly periodic in the window
Hann (Hanning)	Moderate	-31 dB	General-purpose spectral analysis
Hamming	Moderate	-43 dB	Similar to Hann, slightly better sidelobes
Blackman	Wide	-58 dB	When dynamic range matters more than resolution
Flat-top	Widest	-44 dB	Amplitude-accurate measurements (calibration)
Kaiser	Adjustable (β)	Adjustable	When the tradeoff needs tuning

The tradeoff: Wider main lobe means worse ability to distinguish nearby frequencies. Lower sidelobes mean less leakage from strong signals into weak-signal bins. There’s no window that wins on both — this is another manifestation of the time-frequency uncertainty principle.

Overlap processing: To avoid losing information at the tapered block edges, consecutive blocks typically overlap by 50-75%. Each sample appears in multiple blocks, weighted by the window function at its position. The overlap-add or overlap-save method reconstructs a continuous output from overlapping windowed blocks.

Circular vs Linear Convolution#

A subtlety that causes real bugs: the FFT inherently computes circular (periodic) convolution, not the linear convolution needed for filtering. A block of N samples convolved with a filter of length M using a straightforward FFT approach produces wrap-around artifacts unless the FFT length is at least N + M - 1.

Practical solution: Zero-pad both the signal block and the filter to length N + M - 1 before computing the FFT. The overlap-add and overlap-save methods systematize this for continuous processing of sequential blocks.

Tips#

Choose block size based on latency requirements first, then optimize for efficiency
Use Hann or Blackman windows for general spectral analysis — rectangular windows cause excessive leakage
When using FFT-based convolution, always zero-pad to avoid circular convolution artifacts

Caveats#

Off-by-one errors are everywhere — A block of N samples spans indices 0 to N-1. An N-point FFT produces N/2 + 1 unique frequency bins (the rest are conjugate mirrors for real signals). The highest frequency bin represents f_s/2, not f_s. These fencepost errors compound in multi-stage processing
Sample rate must be tracked explicitly — The sequence x[n] has no inherent notion of time or frequency. If a sequence is processed without knowing its sample rate, the results can’t be interpreted in physical units. Every processing chain should carry the sample rate as metadata
Block boundaries create discontinuities — If a signal is processed in blocks without proper overlap and windowing, the block boundaries can introduce clicks, pops, or spectral artifacts. This is especially problematic for time-varying operations (filters with changing coefficients)
Zero-padding does not create resolution — Padding a block with zeros and taking a larger FFT interpolates the spectrum (smoother appearance) but does not increase the fundamental frequency resolution. True resolution depends on the signal duration, not the FFT size
Integer indexing hides timing precision — A sample at index n represents a moment in time, but the actual sampling instant has jitter. For high-resolution systems, the assumption of perfectly uniform spacing breaks down — see Clocking & Jitter

In Practice#

Clicks or pops at regular intervals in processed audio indicate block boundary discontinuities — verify overlap and windowing
Spectral analysis showing smeared peaks rather than clean lines indicates spectral leakage — use an appropriate window function
Unexpected frequency components in FFT output may be wrap-around artifacts from insufficient zero-padding
A processing chain that works at one sample rate but fails at another often has hardcoded values that should be sample-rate-dependent