KSU KDD Wiki: courses-vision

Computer Vision - CIS 798 X (Spring, even years)

CIS 798: Introduction to Computer Vision

First offering: Fall, 2022

Credit hours: 3

Course Description

Computer vision is an interdisciplinary field concerned with developing systems, and particularly algorithms, for image formation and processing that extract a high-level understanding of digital images or videos. It seeks to understand and automate tasks that the human visual system can perform. This course provides a basic introduction to vision for practitioners using the Python programming language. It focuses on the implementation of image processing filters, feature analyzers, and flow processing, with extensive emphasis on the state of the field in convolutional neural networks (ConvNets) and current deep learning methods.

Prerequisites

MATH 220 (Analytic Geometry and Calculus I) or MATH 205 (General Calculus and Linear Algebra)
CIS 111 (Introduction to Programming) or CIS 200 (Fundamentals of Programming) or CIS 209 (Python Programming for Engineers) or CC 210 (Fundamental Computer Programming Concepts)

Textbook

Dive into Deep Learning by Zhang, Lipton, Li, & Smola - required, online (at http://d2l.ai, 2019-present)

Selected readings and references

Computer Vision: A Modern Approach, 2nd edition by Forsyth and Ponce (2012)

Other requirements

Students will be given access to GPU servers and will initially use Google Colab for PyTorch-based programming exercises. However, a personal computer is strongly recommended, and a desktop PC with a GPU (Nvidia RTX 2080 or better) is suggested.

Syllabus

Image formation, projective geometry, and lighting
Practical linear algebra - transformations (rotation, translation, scaling); affine and projective
Basic image processing operations - filters, features, and flow; image transformations
Object recognition - sliding windows and object proposals
Basic biology of vision - retina, visual cortex (especially V1), and color
Features and histograms - SIFT, SURF, GLOH, HOG
Introduction to differentiable computing - artificial neural networks and gradients
Convolutional Neural Network (ConvNet) based approaches to visual recognition of objects and scenes
ConvNet advances - from AlexNet to ResNet to GANs; current efficient, secure, and few-shot/zero-shot methods
Attributes, pose and actions
Contours and segmentation
Geometry - single and multi-view, 3-D reconstructions
Applications
Ethical considerations - problematic use cases (face recognition, etc.), bias; responsible, secure, and trustworthy use
Intro to video analysis - sequences, motion, flow
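
As an illustration of the "Practical linear algebra" topic above, the following is a minimal sketch of composing rotation, translation, and scaling as 3x3 matrices in homogeneous coordinates. It is not taken from the course materials; the helper names (rotation, translation, scaling, apply_transform) are hypothetical, and NumPy is assumed rather than the PyTorch setup used in the course exercises.

```python
import numpy as np

def rotation(theta):
    """3x3 homogeneous matrix rotating counterclockwise by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def translation(tx, ty):
    """3x3 homogeneous matrix shifting points by (tx, ty)."""
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

def scaling(sx, sy):
    """3x3 homogeneous matrix scaling x by sx and y by sy."""
    return np.diag([sx, sy, 1.0])

def apply_transform(matrix, points):
    """Apply a 3x3 homogeneous transform to an (N, 2) array of points."""
    points = np.asarray(points, dtype=float)
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    mapped = homogeneous @ matrix.T
    # Dividing by the last coordinate lets the same helper handle
    # projective transforms, where that coordinate is not always 1.
    return mapped[:, :2] / mapped[:, 2:3]

# Matrices compose right to left: scale by 2, rotate 90 degrees,
# then translate by (1, 0).
affine = translation(1.0, 0.0) @ rotation(np.pi / 2) @ scaling(2.0, 2.0)

# Corners of the unit square, one point per row.
corners = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
transformed = apply_transform(affine, corners)
```

Because the three operations are all expressed as matrices of the same shape, any sequence of them collapses into a single matrix product, which is the main reason for working in homogeneous coordinates.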

Details

This course is predominantly patterned after Stanford CS231n, is heavily influenced by Berkeley's CS 280, and borrows a small amount of material from CS 543 at the University of Illinois, Carnegie Mellon University's 16-385, and the University of Michigan's EECS 442. For more information, please contact the instructor at bhsu@ksu.edu.

Last updated by rotclanny on Aug 18, 2023
