Reading Time: 3 minutes
Daniil Osokin, Developer at OpenCV.ai
Revolutionizing Object Tracking: "Tracking Everything Everywhere All at Once" Paper Review

Tech track #1. "Tracking Everything Everywhere All at Once" review

The "Tracking Everything Everywhere All at Once" paper, a collaborative work by Cornell University, Google Research, and UC Berkeley, offers a breakthrough solution to the problem of tracking any point in video footage. The method maps each pixel from every frame into a common 3D space, tracing its trajectory across time. This method is not designed for real-time tracking but for in-depth recorded video analysis. In this article, we highlight the most exciting points of this paper.
June 15, 2023

Introduction

Last week Cornell University, Google Research, and UC Berkeley published a paper with an intriguing title: "Tracking Everything Everywhere All at Once". It proposes a solution to the track-any-point problem. As the name hints, it tracks everything everywhere: every pixel across all frames, even when they are occluded. See the beautiful visualizations on the paper's OmniMotion website:

[Figure: object trajectory tracking visualizations from the OmniMotion website]

So, if you select a specific pixel, you can find its coordinates in all previous frames as well as in all subsequent ones. That is fantastic! However, we believe (no code is available yet) this pixel should be a good feature to track, e.g., a corner, rather than a point in a single-color or low-textured area.

Two things to note:

First, the algorithm is intended to work on the whole video at once. It runs an optimization process over all video frames, so it is not designed for real-time tracking. However, it could be useful for analyzing surveillance footage or for sports analytics.

Second, it needs an external algorithm to supervise the track optimization process: the authors compute RAFT optical flow for all frame pairs before the optimization.
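Since the paper's preprocessing code is not available, here is a minimal sketch of how this exhaustive pairwise flow computation could look, using the off-the-shelf RAFT model from torchvision. The frame format and the pair-selection loop are our assumptions, not the authors' code.

```python
import torch
from torchvision.models.optical_flow import raft_large, Raft_Large_Weights

weights = Raft_Large_Weights.DEFAULT
model = raft_large(weights=weights).eval()
transforms = weights.transforms()  # converts to float and normalizes

@torch.no_grad()
def pairwise_flows(frames):
    """frames: list of (3, H, W) uint8 tensors; H and W divisible by 8."""
    flows = {}
    for i in range(len(frames)):
        for j in range(len(frames)):
            if i == j:
                continue
            # Exhaustive pairs mean O(N^2) RAFT runs for N frames.
            img1, img2 = transforms(frames[i].unsqueeze(0),
                                    frames[j].unsqueeze(0))
            # RAFT returns a list of iterative refinements; keep the final one.
            flows[(i, j)] = model(img1, img2)[-1].squeeze(0)  # (2, H, W)
    return flows
```

In practice one would batch the pairs and run on a GPU, but the quadratic number of pairs is part of why the method targets offline analysis rather than real-time tracking.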

Idea

The authors propose to map each object pixel from every frame to a single common 3D space (the canonical 3D volume G in the paper). Thus, one point in this space corresponds to a 3D trajectory across time. They use two algorithms to create this space (see the sketch after this list):

NeRF to model volume density and color.

Invertible neural networks to capture the camera parameters and scene motion across frames.
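To make this concrete, below is a minimal, hypothetical sketch of how a pixel could be tracked through the canonical volume. The names `T` (the invertible bijection between a frame's local volume and G), `F` (the NeRF-like network), `latents` (per-frame codes), and the helper are our illustrative assumptions, not identifiers from released code; the fixed orthographic camera follows the paper's setup.

```python
import torch

def composite_weights(sigma, deltas):
    # Standard NeRF-style alpha compositing weights along a ray.
    alpha = 1.0 - torch.exp(-sigma * deltas)
    trans = torch.cumprod(torch.cat([torch.ones(1), 1.0 - alpha]), dim=0)[:-1]
    return alpha * trans

def track_pixel(p_i, i, j, T, F, latents, depths, deltas):
    """Map pixel p_i = (x, y) in frame i to its 2D location in frame j."""
    K = depths.shape[0]
    # 1. Lift the pixel to K samples along its ray in frame i's local volume.
    #    With a fixed orthographic camera this is just (x, y, depth_k).
    xs_i = torch.cat([p_i.expand(K, 2), depths[:, None]], dim=1)   # (K, 3)
    # 2. Map the local samples into the canonical volume G.
    u = T.forward(xs_i, latents[i])                                # (K, 3)
    # 3. Query the NeRF-like network for density in G.
    sigma, _rgb = F(u)
    w = composite_weights(sigma, deltas)                           # (K,)
    # 4. Map the same canonical points into frame j's local volume.
    xs_j = T.inverse(u, latents[j])                                # (K, 3)
    # 5. Composite to an expected 3D location and read off its 2D part.
    x_j = (w[:, None] * xs_j).sum(dim=0)                           # (3,)
    return x_j[:2]
```

Because `T` is a bijection, the same canonical point can be pushed into any frame, which is what lets the method produce a full trajectory for a pixel, even through occlusions.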

The algorithm overview is shown in the picture below:

[Figure: algorithm overview showing the two networks used to build the canonical 3D space]

The loss functions are a flow loss, a photometric loss, and a penalty on large displacements between consecutive 3D point locations, which ensures temporal smoothness of the 3D motion. OmniMotion is compared against other trackers on the TAP-Vid benchmark and shows an impressive improvement.
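As a rough illustration of how these three terms could be combined during optimization, here is a hedged sketch; the loss weights and exact formulations are placeholders, not the paper's values.

```python
import torch

def total_loss(pred_flow, raft_flow, pred_rgb, gt_rgb, traj_3d,
               w_pho=1.0, w_smooth=0.1):  # placeholder weights
    # Flow loss: predicted 2D correspondences vs. precomputed RAFT flow.
    l_flow = (pred_flow - raft_flow).abs().mean()
    # Photometric loss: rendered color vs. the observed frame color.
    l_pho = ((pred_rgb - gt_rgb) ** 2).mean()
    # Temporal smoothness: penalize large displacements between
    # consecutive 3D locations of the same point; traj_3d: (T, 3).
    disp = traj_3d[1:] - traj_3d[:-1]
    l_smooth = disp.norm(dim=-1).mean()
    return l_flow + w_pho * l_pho + w_smooth * l_smooth
```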

Object tracking remains one of the hardest problems in computer vision. This paper significantly advances the state of the art with complex but elegant algorithms. It is a great research paper, and we are looking forward to trying it live!

Paper: https://arxiv.org/pdf/2306.05422.pdf

