This course introduces the fundamentals of image and video processing, including color image capture and representation; contrast enhancement; spatial domain filtering; two-dimensional (2D) Fourier transform and frequency domain interpretation of linear convolution; image sampling and resizing; multi-resolution image representation using pyramid and wavelet transforms; feature point detection and global alignment between images based on feature correspondence; geometric transformation and image registration; video motion characterization and estimation; video stabilization and panoramic view generation; image and video segmentation; selected advanced image processing techniques; basic compression techniques and standards (JPEG image compression standard; wavelet transform and JPEG2000 standard; video compression using adaptive spatial and temporal prediction; video coding standards (MPEGx/H26x)); and stereo and multi-view image and video processing (depth from disparity, disparity estimation, video synthesis, compression). Students will learn to implement selected algorithms in Python. A term project will be required.
Graduate status. EL-GY 6113 and EL-GY 6303 preferred but not required. Undergraduate students must have completed EE-UY 3054 Signals and systems and EE-UY 2233 Probability.
Professor Yao Wang, MTC2 Room 9.122, (646)-997-3469, Email: yaowang at nyu dot edu. Homepage
Yilin Song (MTC2 Room 9.123, ys1297 at nyu dot edu) and Anti-Chiang (MTC2 Room 9.123, atc327 at nyu dot edu)
Monday 12:25 PM-2:55 PM, Room RH215.
Wed. 4-6 PM and Thur. 4-6 PM, or by appointment via email.
Thur. 2-5 PM (TA's office hours)
- Richard Szeliski, Computer Vision: Algorithms and Applications. (Available online: "Link") (Covers most of the material, except sparsity-based image processing and image and video coding)
- (Optional) Y. Wang, J. Ostermann, and Y.-Q. Zhang, Video Processing and Communications. Prentice Hall, 2002. "Link" (Reference for image and video coding, motion estimation, and stereo)
- (Optional) R. C. Gonzalez and R. E. Woods, Digital Image Processing, Prentice Hall, (3rd Edition) 2008. ISBN number 9780131687288. "Link" (Good reference for basic image processing, wavelet transforms and image coding).
Exam: 40%, Final Project: 30%, Programming assignments: 30%. Written homework will be assigned but not graded. Solutions to the written homework will be provided.
Late submissions of programming assignments will be accepted up to 7 days after the deadline, with a penalty of 1 pt for each day late (each assignment is worth 10 pt). Students can work in teams of at most 3 people, and each team should submit one completed assignment. You should include your code (with documentation), results (plots etc.), and discussion. When working in teams, each person must understand what the others did and what is included in the assignment!
Middlebury Stereo Image Database
Links to Resources (lecture notes and sample exams) in Previous Offerings:
- EL 5123 Image Processing
- EL 6123 Video Processing
- EL 6123 Image and Video Processing (S16)
- The Coursera image processing course by Prof. Katsaggelos: Link
- The image processing course at Stanford: Link
- The computer vision course at U. Washington: Link
Other Useful Links
- Intro to Python: First three slides of the Python course at Columbia: Link
- OpenCV: an open source package including many computer vision algorithms
- Matrix Reference Manual
- Codecademy: Python
Tentative Course Schedule
- Week 1 (1/23): Image Formation and Representation: 3D to 2D projection, photometric image formation, trichromatic color representation, video format (SD, HD, UHD, HDR). Contrast enhancement (concept of histogram, nonlinear mapping, histogram equalization): Lecture note on image formation, Lecture note on contrast enhancement
- Tutorial on Python (time/location to be arranged)
- Week 2 (1/30): Review of 1D Fourier transform and convolution. Concept of spatial frequency. Continuous and Discrete Space 2D Fourier transform. 2D convolution and its interpretation in frequency domain. Implementation of 2D convolution. Separable filters. Frequency response. Linear filtering (2D convolution) for noise removal, image sharpening, and edge detection. Gaussian filters, DOG and LOG filters as image gradient operators. Lecture note on Fourier Transform and linear filtering
- Programming assignment 1 (Due 2/6): Linear filtering for noise removal and image sharpening, illustrating the effect in both the spatial and frequency domains. Programming_Filtering
- Week 3 (2/6): Image sampling and resizing. Design of interpolation filters. Multiresolution representation: Pyramid and wavelets. Lecture note on ImageSampling, Lecture note on Wavelet
- Week 4 (2/13): Sparsity and dictionary based image processing: Image representation using orthonormal transform/dictionary. Sparsity assumption. General formulation of image enhancement as an optimization problem, L0 vs. L1 vs. L2 prior, Basic optimization techniques. Applications in deblurring, denoising, inpainting, compressive sensing, superresolution, dictionary learning (PCA and KSVD).
- Programming assignment 2 (Due 3/6): Programming_WaveletDenoising
- 2/20/17 Presidents’ Day. No class.
- Week 5 (2/27): Project Proposal Due
- Week 5 (2/27): Feature detection (SIFT), feature descriptors and matching, and feature based global mapping estimation (Robust least squares and RANSAC). Lecture note on Feature
- Programming assignment 3 (Due 3/20): Stitching a panoramic picture (Feature detection, finding global mapping, warping, combining). Programming_features
- Week 6 (3/6): Geometric transformation. Image warping. Image morphing. Panoramic view stitching. Video stabilization. Lecture note on geometric transformation
- 3/13/17-3/19/17: Spring Recess
- Week 7 (3/20): Image segmentation: region growing, split and merge, Otsu’s method, K-means, mean-shift, normalized cut, graph cut (optional). Lecture note on image segmentation
- Programming assignment 4 (Due 3/27): Implement a specified segmentation method yourself. Compare different methods using available implementations. Programming_Kmeans
- Week 8 (3/27): Midterm Project Feedback (Individual meeting)
- Week 8 (3/27): Dense motion/displacement estimation: optical flow equation, optical flow estimation; block matching, multi-resolution estimation. Deformable registration (medical applications). Lecture note on MotionEstimation
- Week 9 (4/3): Moving object detection (background/foreground separation) (Gaussian mixture model, RPCA). Simultaneous estimation of camera motion and moving objects. Object tracking. Video shot segmentation. Lecture note on Moving object detection
- Programming assignment 5 (Due 4/10): Moving object detection. Programming_Moving object detection
- Week 10 (4/10): Stereo and multiview video: depth from disparity, disparity estimation, view synthesis. Depth camera (Kinect). 360 video camera and view stitching. Light field imaging. Lecture note on Stereo and Multiview Video Processing
- Week 11 (4/17): Exam
- Week 12 (4/24): Fundamentals of source coding: characterization of random sources by entropy, binary encoding, scalar quantization, vector quantization. Lecture note on source coding basics.
- Week 13 (5/1): Transform coding: Image representation using unitary transforms (orthogonal bases), Transform coding, JPEG image compression standard, Image representation using wavelet transform; concept of layered coding, JPEG2000 image compression standard. Lecture note on unitary transforms and Transform Coding.
- Programming assignment 6 (Due 5/17): Video coding. Video Coding Assignment
- Week 14 (5/8): Part 1: Video coding: block-based motion compensated prediction and interpolation, adaptive spatial prediction, block-based hybrid video coding, rate-distortion optimized mode selection, rate control, Group of pictures (GoP) structure, tradeoff between coding efficiency, delay, and complexity. Lecture note on Video Coding. Part 2: Overview of video coding standards (AVC/H.264, HEVC/H.265); Layered video coding: general concept and H.264/SVC. Multiview video compression. Error resilience issues. Adaptive Video Streaming (DASH). Lecture note on Video Coding Standards.
- Week 15 (5/15): Project Presentation.
- 5/17: Project Report and all other material must be uploaded.
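As a rough illustration of the kind of Python implementation the programming assignments call for, the following is a minimal sketch of histogram equalization, one of the Week 1 contrast enhancement topics. It is not taken from the course materials: the function name equalize_histogram and the synthetic test image are assumptions made for this example, and it assumes an 8-bit grayscale image stored as a 2D NumPy uint8 array.

```python
import numpy as np


def equalize_histogram(image):
    """Histogram-equalize an 8-bit grayscale image (2D uint8 array).

    Hypothetical helper for illustration; not part of the course code.
    """
    # Count how many pixels take each of the 256 possible gray levels.
    hist = np.bincount(image.ravel(), minlength=256)
    # Cumulative distribution function of the gray levels, in [0, 1].
    cdf = np.cumsum(hist) / image.size
    # Nonlinear mapping: send each gray level to 255 * CDF(level).
    lut = np.round(255 * cdf).astype(np.uint8)
    # Apply the lookup table to every pixel.
    return lut[image]


# Synthetic low-contrast image (an assumption): values confined to [100, 140).
rng = np.random.default_rng(0)
low_contrast = rng.integers(100, 140, size=(64, 64), dtype=np.uint8)
equalized = equalize_histogram(low_contrast)

print("input range: ", low_contrast.min(), low_contrast.max())
print("output range:", equalized.min(), equalized.max())
```

The lookup table is the scaled cumulative histogram, so the mapping is monotonic and stretches the occupied gray levels across the available range. Library routines such as OpenCV's cv2.equalizeHist may differ in how they normalize the mapping.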
Policy on Academic Dishonesty:
The School of Engineering encourages academic excellence in an environment that promotes honesty, integrity, and fairness. Please see the policy on academic dishonesty: Link