How can robots perceive the world and their own motion so that they can accomplish navigation and manipulation tasks? In this course, we will study how this age-old question has been approached for robots equipped with visual sensing capabilities. We focus on how a robot can make decisions based on raw, high-dimensional sensory data that contains only partial, noisy observations of the hidden state of the environment. The course is therefore divided into two main themes: (i) Estimation and (ii) Decision-Making and Control. In each theme, we will study traditional approaches, learning-based methods, and combinations of the two.

For estimation, we will study how images and videos acquired by cameras mounted on robots are transformed into representations like features and optical flow. Such 2D representations then allow us to extract 3D information about where the camera is, in which direction the robot moves, or how objects in front of the camera are moved. We will also study linear and nonlinear filtering methods that allow the robot to keep track of itself, other objects, and the environment state over time. Finally, we will study how these problems can be formulated as learning problems to mitigate some of the biases introduced by traditional methods.
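As a rough illustration of the linear filtering mentioned above, here is a minimal sketch of a one-dimensional Kalman filter that tracks a position from noisy measurements. It is not taken from the course materials: the function name `kalman_step`, the noise variances, and the simulated measurements are all assumptions chosen for illustration.

```python
import numpy as np

def kalman_step(mean, var, z, motion=0.0, motion_var=0.01, meas_var=0.25):
    """One predict/update cycle of a 1D Kalman filter (illustrative values)."""
    # Predict: apply the motion model and grow the uncertainty.
    mean_pred = mean + motion
    var_pred = var + motion_var
    # Update: blend prediction and measurement z, weighted by the Kalman gain.
    gain = var_pred / (var_pred + meas_var)
    mean_new = mean_pred + gain * (z - mean_pred)
    var_new = (1.0 - gain) * var_pred
    return mean_new, var_new

# Simulated example: a static robot at position 1.0, measured with noise.
rng = np.random.default_rng(0)
true_position = 1.0
mean, var = 0.0, 1.0  # initial belief (assumed)
for _ in range(50):
    z = true_position + rng.normal(scale=0.5)
    mean, var = kalman_step(mean, var, z)
```

After the loop, `mean` is the filtered position estimate and `var` its remaining uncertainty, which shrinks well below the initial value as measurements accumulate.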

For decision-making and control, we will study the basic formulations of optimal and adaptive control and relate them to learning-based approaches such as model-based and model-free reinforcement learning. We will apply what we have learned to vision-based control methods, i.e., image-based and position-based visual servoing. We will also study how learning from demonstration, or imitation learning, can mitigate some of the sampling problems in RL.

We bridge these two themes of estimation and decision-making by also studying representations of raw sensory data and how they can be learned such that they are optimal for a task. We will also discuss how perception and action influence each other.

For both estimation and decision-making, we draw connections to what we know about human and animal intelligence from cognitive science. We will observe similarities and differences between artificial and biological systems and revisit the most relevant experiments and explanations for human perception and decision-making.

Mengyuan Yan

Michelle Lee

Lecture: Monday, Wednesday 3:00pm-4:20pm

Gates Building, B12

Assignment #2: 15%

Assignment #3: 15%

Midterm: 20%

Course Project: 35%

See the Grading Policy for more details.

Gradescope Code: M7KJ88.

- Proficiency in Python, high-level familiarity with C/C++

All class assignments will be in Python (and use numpy and later PyTorch). Here is a tutorial for those who aren't as familiar with Python. Some of the learning or control libraries we may look at later in the class are written in C++. If you have a lot of programming experience but in a different language (e.g. C/C++/Matlab/Javascript) you will probably be fine.

- College Calculus, Linear Algebra (e.g. MATH 19 or 41, MATH 51, CME100)

You should be comfortable taking (multivariable) derivatives and understanding matrix/vector notation and operations.

- Basic Probability and Statistics (e.g. CS 109 or other stats course)

You should know the basics of probability, Gaussian distributions, mean, standard deviation, etc.

- Foundations of Machine Learning (e.g. CS229)

We will be formulating cost functions, taking derivatives and performing optimization with gradient descent. If you already have basic machine learning and/or deep learning knowledge, the course will be easier; however, it is possible to take CS336 without it. There are many introductions to ML, in webpage, book, and video form. One approachable introduction is Hal Daumé's in-progress A Course in Machine Learning. Reading the first 5 chapters of that book would be good background. Knowing the first 7 chapters would be even better!

- Optional: Foundations in Optimal Estimation, Control and RL (e.g. AA 273, AA 203, CS234, CS238)

We will be implementing Bayesian filters and optimal controllers to include them as inductive biases in learning-based approaches. If you have already taken courses such as AA 273, AA 203, CS234 or CS238, some parts of this course will be easier for you. However, it is possible to take this course without having taken them.

This course on robot perception and learning is focussed on making decisions based on raw, high-dimensional sensory data that represent only partial, noisy observations of the environment.
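As a rough self-check for the machine learning prerequisite above (formulating a cost function, taking its derivative, and optimizing with gradient descent), here is a minimal numpy sketch. It is not part of the course assignments: the linear model, the function names `cost`, `grad` and `gradient_descent`, the learning rate, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def cost(w, X, y):
    """Half the mean squared error of the linear model X @ w against y."""
    residual = X @ w - y
    return 0.5 * np.mean(residual ** 2)

def grad(w, X, y):
    """Gradient of cost() with respect to the weights w."""
    return X.T @ (X @ w - y) / len(y)

def gradient_descent(X, y, lr=0.1, steps=500):
    """Plain gradient descent with a fixed step size, starting from zero."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        w = w - lr * grad(w, X, y)
    return w

# Synthetic, noise-free data generated from known weights (assumed values).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([2.0, -1.0])
y = X @ true_w

w_hat = gradient_descent(X, y)
```

If reading and modifying a snippet like this feels comfortable, including deriving `grad` from `cost` by hand, you likely have the background this prerequisite asks for.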

- Slides

We provide links to the slides for most of the classes. However, the slides won't contain the full course: derivations and other parts of the classes will be presented live.

- Course Notes

We do not provide course notes, but there is a student initiative to generate them and it is published here.

- Computer Vision

We follow the notation and conventions of the fantastic Hartley and Zisserman book. Stanford provides an online copy here.

- Recursive Estimation and Robotics

We will follow the conventions of Probabilistic Robotics by Thrun, Burgard and Fox. Great book! For some more advanced derivations in the context of robotics and 6D motion, we recommend State Estimation for Robotics by Barfoot, available here (its notation may differ from ours).

Can I take this course on a credit/no credit basis?

Yes. Credit will be given to those who would have otherwise earned a C- or above.

Can I audit or sit in?

In general, we are very open to guests sitting in if you are a member of the Stanford community (registered student, staff, and/or faculty). Out of courtesy, we would appreciate it if you first email us or talk to the instructor after the first class you attend. If the class is too full and we're running out of space, we would ask that you please allow registered students to attend.

Can I work in groups for the Final Project?

Yes, in groups of up to two people.

I have a question about the class. What is the best way to reach the course staff?

Almost all questions should be asked on Piazza. If you have a sensitive issue you can email the instructors directly.

Can I combine the Final Project with another course?

Yes, you may; however, before doing so you must receive permission from the instructors of both courses.