ICCV 2019 Tutorial on

Visual Recognition for Images, Video, and 3D

Location: Auditorium

Monday, October 28, 2019 (full day)


Haoqi Fan

Nikhila Ravi
Wan-Yen Lo


The purpose of this tutorial is to discuss popular approaches and recent advances in the family of visual recognition tasks across different input modalities. We will cover in detail the most recent work on object recognition and scene understanding. Going beyond single images, we will show current progress in video (detection and classification) and in 3D visual recognition (multi-object mesh prediction). Our goal is to show the existing connections between techniques specialized for different input modalities and to provide some insight into the distinct challenges that each modality presents.

In conjunction with the tutorial, we are open-sourcing three new visual recognition systems, for images, videos, and 3D respectively. These PyTorch-based systems contain multiple state-of-the-art methods in the corresponding domains. In our tutorial, we will pair each research talk with a talk that discusses these codebases, sharing best engineering practices and showing implementation details for each domain. We hope that this pairing will help researchers who are interested primarily in visual recognition to build and benchmark their systems more easily. For researchers from other areas, we hope to make state-of-the-art (SOTA) recognition systems easy to incorporate into their frameworks.


09:00 - 09:15 Overview of Visual Recognition Research and OSS (slides) - Wan-Yen Lo

09:15 - 10:00 On Connectivity for Representation Learning (slides) - Saining Xie

10:00 - 10:45 Coffee Break

10:45 - 11:45 Object Detection and Instance Segmentation (slides) - Ross Girshick

11:45 - 12:30 Detectron 2 (slides) - Yuxin Wu

12:30 - 14:00 Lunch Break

14:00 - 14:45 Mesh Reconstruction in the Wild (slides) - Georgia Gkioxari / Justin Johnson

14:45 - 15:30 Torch3d & Mesh R-CNN (slides) - Nikhila Ravi / Georgia Gkioxari

15:30 - 16:00 Coffee Break

16:00 - 16:45 Recognition in Video (slides) - Christoph Feichtenhofer

16:45 - 17:15 Video Recognition Codebase (slides) - Haoqi Fan

Contact: Alexander Kirillov