Multimedia Signal Processing

Relevant scientific areas: ING-INF/03 and ING-INF/05
The course is split into two modules. The first module covers the fundamental tools of digital signal processing. In particular, it addresses signal analysis and filtering for audio and image data from both the deterministic and the statistical viewpoints. The second module covers applications of digital signal processing to multimedia communication, e.g. audio, image and video coding. The course also offers insight into widely adopted international coding standards such as MPEG Audio, JPEG and MPEG Video, among others.

Module 1: Fundamentals of digital signal processing

  • Introduction to the Discrete Fourier Transform (DFT): DFT, IDFT, derivation of the DFT, Fourier theorems for the DFT
  • Introduction to digital filters: Time-domain representations, transfer function analysis, frequency response analysis
  • Windowing and Short-Time Fourier Transform (STFT): Overview of windows, overlap-add, STFT
  • Digital filter design techniques: Filter specifications, IIR filter design, FIR filter design
  • Introduction to multirate processing: Downsampling, upsampling, decimation, interpolation, polyphase filters, perfect reconstruction filter banks
  • Random sequences: Expectations, i.i.d. sequences, jointly distributed random sequences, correlation and covariance sequences, time averages and ergodicity
  • Spectral estimation: Introduction to estimation theory, estimate bias and variance, maximum likelihood estimation, Bayesian estimation, power spectral density, nonparametric spectral estimation, parametric spectral estimation
  • Linear prediction: Autocorrelation and autocovariance methods, frequency-domain interpretation of linear prediction
  • Wiener filtering: Principle of orthogonality, Wiener-Hopf equations, non-causal and causal Wiener filtering, applications to denoising, echo cancellation and channel equalization

Module 2: Fundamentals of coding

  • Source coding: Discrete memoryless sources, discrete sources with memory, entropy of a source, uniquely decodable and prefix codes, Shannon’s source coding theorem, Huffman coding, arithmetic coding, run-length coding
  • Quantization: Uniform scalar quantization, Lloyd-Max scalar quantization, entropy-constrained scalar quantization, rate-distortion theory, vector quantization
  • Predictive coding: Linear predictive coding, DPCM, delta modulation, predictive coding gain
  • Transform coding: Linear transforms, unitary transforms, linear approximation, non-linear approximation, KLT, DCT, transform coding gain, bit allocation, sub-band coding, wavelet transform, 2D transforms
  • Review of waveform coding of audio signals: PCM, DPCM, delta modulation, ADPCM, lossless compression techniques
  • Speech coding: Vocal tract modeling, LPC, pitch extraction, voiced/unvoiced detection, analysis by synthesis
  • Audio coding: Fundamentals of psychoacoustics, frequency masking, temporal masking, filter banks (PQMF, MDCT), bit allocation and entropy coding. Coding standards: MPEG-Audio, Advanced Audio Coding (AAC), AC3
  • Image coding: Human visual system, visual redundancy and irrelevancy, lossless and lossy image coding, transform coding and quantization. Coding standards: JPEG
  • Video coding: DPCM, motion estimation, coding of prediction residuals, coding of motion vectors, rate-distortion optimization. Coding standards: MPEG-x, H.264/AVC