Computer Vision - CIS 798 X (Spring, even years)
CIS 798: Introduction to Computer Vision
First offering: Fall, 2022
Credit hours: 3
Course Description
Computer vision is an interdisciplinary field concerned with developing systems, and particularly algorithms, for image formation and processing that gain a high-level understanding of digital images and videos. It seeks to understand and automate tasks that the human visual system can do. This course provides a basic introduction to vision for practitioners, using the Python programming language. It focuses on implementing image processing filters, feature analyzers, and flow processing, with extensive emphasis on the state of the art in convolutional neural networks (ConvNets) and current deep learning methods.
Prerequisites
- MATH 220 (Analytic Geometry and Calculus I) or MATH 205 (General Calculus and Linear Algebra)
- CIS 111 (Introduction to Programming) or CIS 200 (Fundamentals of Programming) or CIS 209 (Python Programming for Engineers) or CC 210 (Fundamental Computer Programming Concepts)
Textbook
Dive into Deep Learning by Zhang, Lipton, Li, & Smola - required, online (at http://d2l.ai, 2019-present)
Selected readings and references
Computer Vision: A Modern Approach, 2nd edition by Forsyth and Ponce (2012)
Other requirements
Students will be given access to GPU servers and will initially use Google Colab for PyTorch-based programming exercises. However, a personal computer is strongly recommended, and a desktop PC with a GPU (Nvidia RTX 2080 or better) is suggested.
Syllabus
- Image formation / projective geometry / lighting
- Practical linear algebra - transformations (rotation, translation, scaling); affine and projective
- Basic image processing operations - filters, features, and flow; image transformations
- Object recognition - sliding windows and object proposals
- Basic biology of vision - retina, visual cortex (especially V1), and color
- Features and histograms - SIFT, SURF, GLOH, HOG
- Introduction to differentiable computing - artificial neural networks and gradients
- Convolutional Neural Network (ConvNet) based approaches to visual recognition of objects and scenes
- ConvNet advances - from AlexNet to ResNet to GANs; current efficient, secure, and few-shot/zero-shot methods
- Attributes, pose and actions
- Contours and segmentation
- Geometry - single and multi-view, 3-D reconstructions
- Applications
- Ethical considerations - problematic use cases (face recognition, etc.), bias; responsible, secure, and trustworthy use
- Intro to video analysis - sequences, motion, flow
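As an illustration of the "practical linear algebra" topic listed above, the following is a minimal sketch of how rotation, translation, and scaling can be composed as 3x3 matrices in homogeneous coordinates. It is not taken from the course materials: the helper names (`matmul`, `rotation`, `translation`, `scaling`, `apply_transform`) are hypothetical, and only the Python standard library is used so the example runs without PyTorch or a GPU.

```python
import math


def matmul(a, b):
    """Multiply two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation(theta):
    """Counter-clockwise rotation by theta radians about the origin."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def translation(tx, ty):
    """Shift by (tx, ty); only expressible as a matrix in homogeneous form."""
    return [[1.0, 0.0, tx], [0.0, 1.0, ty], [0.0, 0.0, 1.0]]


def scaling(sx, sy):
    """Scale the x and y axes independently."""
    return [[sx, 0.0, 0.0], [0.0, sy, 0.0], [0.0, 0.0, 1.0]]


def apply_transform(m, point):
    """Apply a 3x3 homogeneous transform to a 2-D point (x, y)."""
    x, y = point
    xh = m[0][0] * x + m[0][1] * y + m[0][2]
    yh = m[1][0] * x + m[1][1] * y + m[1][2]
    w = m[2][0] * x + m[2][1] * y + m[2][2]
    # Dividing by w also covers projective transforms, where w != 1.
    return (xh / w, yh / w)


# Matrices compose right to left: scale by 2, rotate 90 degrees,
# then translate by (1, 0).
transform = matmul(translation(1.0, 0.0),
                   matmul(rotation(math.pi / 2), scaling(2.0, 2.0)))
result = apply_transform(transform, (1.0, 0.0))
print(result)  # approximately (1.0, 2.0), up to floating-point error
```

Because all three operations share one matrix form, an entire chain collapses into a single 3x3 matrix. The same representation extends to general projective transforms when the bottom row differs from (0, 0, 1).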
Details
This course is predominantly patterned after Stanford's CS231n, is heavily influenced by Berkeley's CS 280, and borrows a small amount of material from CS 543 at the University of Illinois, Carnegie Mellon University's 16-385, and the University of Michigan's EECS 442. For more information, please contact the instructor at bhsu@ksu.edu.
Last updated by rotclanny on Aug 18, 2023