CVPR 2026 Generalizing Visual Geometry - Detailed Analysis & Overview

[CVPR 2026] Generalizing Visual Geometry Priors to Sparse Gaussian Occupancy Prediction

[CVPR 2026 Highlight] Towards Multimodal Domain Generalization with Few Labels

[CVPR 2026] MotionCrafter: Dense Geometry and Motion Reconstruction with a 4D VAE

How much do video diffusion models know about the 4D world? By introducing a 4D VAE, we jointly estimate ...

[CVPR 2026] Scene-Centric Unsupervised Video Panoptic Segmentation

Authors: Christoph Reich*, Oliver Hahn*, Nikita Araslanov, ...

[CVPR 2026] CarlaOcc

[CVPR 2026 Highlight] Visual-RRT: Finding Paths toward Visual-Goals via Differentiable Rendering

[CVPR 2026] Disentangle-then-Align: Non-Iterative Hybrid Multimodal Image Registration via Cross-Scale Feature Disentanglement

[CVPR 2026] Revisiting Monocular SLAM with Spatio-Temporal Scene Modeling

MERL researcher Pedro Miraldo presents the paper “Revisiting Monocular SLAM with Spatio-Temporal Scene Modeling” at the ...

Beyond Geometry: Artistic Disparity Synthesis for Immersive 2D-to-3D | CVPR 2026

This video presents our ...

[CVPR 2026] OddGridBench: Exposing the Lack of Fine-Grained Visual Discrepancy Sensitivity in MLLMs

[CVPR 2026] VAD-GS

[CVPR 2026] FoV-Net: Rotation-Invariant CAD B-rep Learning via Field-of-View Ray Casting

Authors: Matteo Ballegeer, Dries F. Benoit
Paper: https://arxiv.org/abs/2602.24084
Google Scholar: ...

[CVPR 2026] Deformation-based In-Context Learning for Point Cloud Understanding

Chengxing Lin, Jinhong Deng, Yinjie Lei, Wen Li. "Deformation-based In-Context Learning for Point Cloud Understanding."

OccAny: Generalized Unconstrained Urban 3D Occupancy | CVPR 2026

OccAny is the first ...

ConceptOT: The Geometry of Vision Language Alignment (CVPR, VisCon 2026)

Video overview of CVPR, VisCon

CVPR 2026 MaskDiME

This video briefly introduces our ...

[CVPR 2026] Omni-Attribute - Technical Presentation

Omni-Attribute encodes a high-fidelity, attribute-specific image representation that enables coherent synthesis of the ...

[CVPR 2026] Dense Metric Depth Completion from Sparse Direct Time-of-Flight Sensors

Hakyeong Kim, Ruicheng Wang, Chengtang Yao, Jiaolong Yang, Min H. Kim ...

[CVPR 2026] VGent: Visual Grounding via Modular Design for Disentangling Reasoning and Prediction
