Speaker: Dr. Adam Harley
From: Stanford University
Abstract
In computer vision, “video understanding” typically concerns summarization: tracking the main objects, or describing the main actions. While progress here has been impressive, many practical applications require extracting information that is much more fine-grained. For example, biologists are keenly interested in tracking specific keypoints of organisms in long video recordings. Algorithms for such tasks require the generality and precision of low-level vision methods (e.g., optical flow), but benefit from knowledge about the 3D physical world (e.g., things continue to exist while they are occluded). In this talk, I will present our progress on this crucial space of problems. Our central contribution is to widen the window of “temporal context” used for inference: instead of tracking entities from one frame to the next, we inspect dozens of frames simultaneously and return an answer that makes sense for the full clip. I will discuss the methods and datasets we have created to drive progress along these lines, and highlight natural science applications of the work. Finally, I will introduce our ongoing effort to produce a “foundation model” of motion, aiming to deliver reliable, arbitrary-granularity tracking across the huge variety of real-world situations that require it.