CVPR 2025 - Nashville, United States

C3DV: 3rd Workshop on Compositional 3D Vision

Third workshop on compositional 3D vision and 3DCoMPaT dataset challenges, hosted by #CVPR2025.


Program

Workshop Program

Join us for a day of exciting talks and discussions on the latest advances in computer vision and AI.

08:15

Opening Remarks (08:15 AM - 08:25 AM)


  • Organizers

08:25

Invited Talk 1 (08:25 AM - 08:55 AM)

Andrea Vedaldi

Andrea Vedaldi Talk

  • Andrea Vedaldi
Bio

Prof. Vedaldi is a Professor of Computer Vision and Machine Learning and co-lead of the VGG group in the Department of Engineering Science at the University of Oxford. His research focuses on using computer vision and machine learning methods to automatically understand the content of images and videos, in terms of both semantics and 3D geometry, with little to no manual supervision. He is also the lead author of the VLFeat and MatConvNet computer vision and deep learning libraries.

08:55

Invited Talk 2 (08:55 AM - 09:25 AM)

Vladimir Kim

Vova Kim Talk

  • Vladimir (Vova) Kim
Bio

Vladimir (Vova) Kim is a Senior Research Scientist at Adobe Research, where his work focuses on computer vision, machine learning, and 3D geometry processing. His research spans a wide range of topics, including 3D reconstruction, shape analysis, and generative models for 3D content creation. Vova is well-known for his contributions to developing advanced algorithms that bridge the gap between visual understanding and 3D modeling, helping to shape the future of creative technologies. He holds a Ph.D. from Princeton University, and his work has been widely published in top-tier conferences such as CVPR and SIGGRAPH.

09:25

Coffee Break (09:25 AM - 09:45 AM)

09:45

Invited Talk 3 (09:45 AM - 10:15 AM)

Tolga Birdal

Tolga Birdal Talk

  • Tolga Birdal
Bio

Prof. Birdal is an Assistant Professor (Lecturer) in the Department of Computing at Imperial College London. Previously, he was a senior postdoctoral research fellow at Stanford University, within the Geometric Computing Group of Prof. Leonidas Guibas. His current research interests include geometric machine learning and 3D computer vision. His more theoretical work investigates the limits of geometric computing and non-Euclidean inference, as well as the principles of deep learning.

10:15

Invited Talk 4 (10:15 AM - 10:45 AM)

Hao Su

Hao Su Talk

  • Hao Su
Bio

Prof. Su is an Associate Professor in the Department of Computer Science and Engineering at the University of California, San Diego. His research focuses on artificial intelligence, 3D vision, and robotics, with a particular emphasis on deep learning for 3D understanding, 3D reconstruction, and robot learning. He has made significant contributions to the development of neural representations for 3D data, advancing fields such as 3D shape analysis and scene understanding.

10:45

Invited Talk 5 (10:45 AM - 11:15 AM)

Angel Chang

Angel Chang Talk

  • Angel Chang
Bio

Prof. Chang is an Associate Professor in the School of Computing Science at Simon Fraser University. Prior to this, she was a visiting research scientist at Facebook AI Research and a research scientist at Eloquent Labs. Her research focuses on bridging the gap between language and 3D representations of shapes and scenes, grounding language for embodied agents, and synthesizing 3D environments from natural language.

11:15

Lunch Break (11:15 AM - 12:15 PM)

12:15

Invited Talk 6 (12:15 PM - 12:45 PM)

Yi Fang

Yi Fang Talk

  • Yi Fang
Bio

Prof. Fang is an Associate Professor of Electrical and Computer Engineering at NYU Abu Dhabi and NYU Tandon. He directs the NYU Multimedia and Visual Computing Lab. His research focuses on 3D computer vision and machine learning with applications to robotics and autonomous driving. He is currently working on the development of 3D deep learning technologies for large-scale visual computing, cross-domain and cross-modality models, and their various industrial applications.

12:45

Closing Remarks (12:45 PM - 1:00 PM)


  • Organizers

Challenges

3DCoMPaT-200 Challenge



🔍 Challenge overview


GCR is a 3D vision task for recognizing material-part compositions on 3D objects using the 3DCoMPaT dataset. We offer two variants, GCR-Coarse and GCR-Fine, with different segmentation granularities, plus a Language-Based Part Grounding challenge in which models segment parts from text prompts.
Evaluation uses metrics including Shape Accuracy and Grounded-value-all. Both challenges run from March to May 2025, with results announced in June.

We encourage participation in both tracks.

📊 Dataset


The 3DCoMPaT dataset for both challenge tracks is available through our download page.

📨 Submission


Submissions will be made through the EvalAI platform, for both the Part-Segmentation Challenge and the Language Grounding Challenge.

📜 Rules


Here are the rules for the challenge:

  • Submission Limit: Each participant is allowed to submit their solution a maximum of three times per day.
  • Data Usage: Participants are not permitted to use any data other than the 3DCoMPaT data for training their models.
  • Technical Report: Each participant must submit a technical report detailing their methods, which will be made public, in order to be eligible for any prizes or rewards.


🏆 Awards


Total prize pool: $1,500. Teams are encouraged to participate in both challenge tracks.
Fine track:
  • 1st: $500
  • 2nd: $250
Coarse track:
  • 1st: $500
  • 2nd: $250

These prizes are intended to encourage strong participation and reward teams that perform exceptionally well, and to help drive innovation in the field of 3D computer vision. Note that eligibility for these prizes is contingent on adhering to the challenge rules: participants must submit their solutions in accordance with the rules and provide a technical report detailing their methods to be considered for any prizes or rewards.

💬 Q&A


If you encounter any technical issue related to the challenge, or if you're missing critical information, please open a ticket on our GitHub repository.


🎉 2023 Winning Solution


We share the 2023 challenge winner's solution repository below:


Cattalya's repository


Challenges

3DCoMPaT-200 Challenges

3DCoMPaT-200 Grounded CoMPaT Recognition (GCR) Challenge


Grounded CoMPaT Recognition (GCR). Given an input shape (here, a chair), the task consists of (a) recognizing the shape category and (b) segmenting the part-material pairs composing it.


Grounded CoMPaT Recognition (GCR) is a compositional 3D vision task that aims to collectively recognize and ground compositions of materials on parts of 3D objects. We organize two variations of this task and adapt state-of-the-art multi-view 2D and 3D deep learning methods to solve the problem. Documentation describing the 3DCoMPaT-200 dataset and the GCR task can be found here.


Evaluation

Inspired by the metrics proposed in [Yatskar2016, Pratt2016] for compositional situation recognition of activities in images, we define the compositional metrics of the 2D/3D Grounded CoMPaT Recognition (GCR) task as follows:

  • (a) Shape Accuracy: accuracy of the predicted shape category.
  • (b) Value: accuracy of predicting both part category and the material of a given part correctly.
  • (c) Value-all: accuracy of predicting all the (part, material) pairs of a shape correctly.
  • (d) Grounded-value: accuracy of predicting both part category and the material of a given part as well as correctly grounding it.
  • (e) Grounded-value-all: accuracy of predicting all the (part, material) pairs of a given shape correctly and grounding all of them correctly.

All these metrics are calculated for each shape and then averaged across them to avoid bias toward shapes with more parts. Given the shape dependence of metrics, we define three settings:

  • (a) Ground Truth Shape: the ground truth shape is assumed to be correct.
  • (b) Top-1 Shape: Shape category is predicted correctly.
  • (c) Top-5 Shape: Shape category is in the top-5 predictions.

For (b) and (c), part-material pairs and their groundings are considered incorrect if the shape is not in top-1 or top-5 predictions, respectively.
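
For concreteness, below is a minimal sketch of how these metrics could be computed for a single shape. The dictionary layout used for predictions and ground truth, and the point-wise IoU criterion used to decide whether a part is grounded correctly, are illustrative assumptions rather than the official submission format; the exact definitions are given in the challenge documentation and evaluation code.

```python
import numpy as np

def gcr_metrics_single_shape(pred, gt, iou_thresh=0.5):
    """Sketch of the per-shape GCR metrics (illustrative, not official).

    `pred` and `gt` are dicts with hypothetical keys:
      - "shape":  shape category id (int)
      - "pairs":  dict mapping part id -> material id
      - "points": (N,) array of per-point part ids (the grounding)
    """
    # (a) Shape Accuracy: predicted shape category matches the ground truth.
    shape_ok = pred["shape"] == gt["shape"]

    value_hits, grounded_hits = [], []
    for part, material in gt["pairs"].items():
        # (b) Value: both the part and its material are predicted correctly.
        value_ok = pred["pairs"].get(part) == material
        value_hits.append(value_ok)

        # (d) Grounded-value: the pair is correct AND the part is grounded
        # correctly; a point-wise IoU threshold is used here as a proxy.
        pred_mask = pred["points"] == part
        gt_mask = gt["points"] == part
        union = np.logical_or(pred_mask, gt_mask).sum()
        iou = np.logical_and(pred_mask, gt_mask).sum() / union if union else 0.0
        grounded_hits.append(value_ok and iou >= iou_thresh)

    return {
        "shape_accuracy": float(shape_ok),
        "value": float(np.mean(value_hits)),                 # (b)
        "value_all": float(np.all(value_hits)),              # (c)
        "grounded_value": float(np.mean(grounded_hits)),     # (d)
        "grounded_value_all": float(np.all(grounded_hits)),  # (e)
    }
```

Per-shape scores like these are then averaged across all shapes, and in the Top-1 and Top-5 settings every pair of a shape counts as incorrect when its category is not among the top predictions.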


Challenge timeline

We propose the following tentative timeline for the 3DCoMPaT-200 challenge:


  • Start: March 1, 2025
  • Submission Deadline: May 30, 2025
  • Decision: June 12, 2025


3DCoMPaT-200 Language-Based Part Grounding


Challenge description

Task (Part Grounding): Given text prompts referring to one or more parts of a shape, participants will design a model that segments the mentioned parts in the shape's point cloud. The challenge offers various levels of difficulty, with participants having access to grounding prompts covering different numbers of parts per shape.
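
To make the expected input/output contract concrete, here is a minimal stub of a grounding function for a single shape. The function name, signature, and boolean-mask output are assumptions made for illustration; the official I/O and submission format will be specified on the EvalAI challenge page.

```python
import numpy as np

def ground_parts(points: np.ndarray, prompt: str) -> np.ndarray:
    """Hypothetical interface for language-based part grounding.

    Args:
        points: (N, 3) point cloud of a single shape.
        prompt: text referring to one or more parts, e.g. "the chair legs".
    Returns:
        (N,) boolean mask marking the points of the referred part(s).
    """
    # Placeholder: a real submission would encode the prompt with a text
    # encoder, encode the point cloud with a point backbone, and predict a
    # per-point relevance score; here no points are selected.
    return np.zeros(len(points), dtype=bool)

# Usage sketch on random data:
# mask = ground_parts(np.random.rand(2048, 3), "the chair legs")
```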


Challenge timeline

We propose the following tentative timeline for the Language-Based Part Grounding challenge:


  • Start: March 1, 2025
  • Submission Deadline: May 30, 2025
  • Decision: June 12, 2025

Paper submission

Call for Papers

🎯 Paper Submissions


We invite researchers to submit their work on compositional 3D vision for the C3DV workshop. Selected papers will be presented during the workshop in poster and oral sessions.

More information on paper submission and presentation will be announced soon.

🦜 Topics


Besides the 3DCoMPaT and 3DCoMPaT-200 challenges, the C3DV workshop also accepts papers related to compositional 3D vision. The workshop will include poster and oral sessions for related works. Topics of this workshop include, but are not limited to:

  • Deep learning methods for compositional 3D vision
  • Self-supervised learning for compositional 3D vision
  • Visual relationship detection in 3D scenes
  • Zero-shot recognition/detection of compositional 3D visual concepts
  • Novel problems in 3D vision and compositionality
  • Text/composition to 3D generation
  • Text/composition-based editing of 3D scenes/objects
  • Language-guided 3D visual understanding (objects, relationships, ...)
  • Transfer learning for compositional 3D Vision
  • Multimodal pre-training for 3D understanding
  • Composition-based 3D object/scene search/retrieval
  • Compositional 3D vision aiding language problems
  • ...

Submitted 4-page abstracts must follow the CVPR format and will be peer-reviewed. Abstracts will be presented in the workshop poster session, and a portion of the accepted papers will be presented orally.



📨 Submission


Paper submissions will be handled with CMT through the following link:

(available soon.)

Please select the appropriate track (archival or non-archival) and check the relevant timelines in the Dates section.

Speakers

Invited Speakers

Hao Su

Associate Professor UC San Diego

Tolga Birdal

Assistant Professor Imperial College London

Vladimir Kim

Senior Research Scientist Adobe Research

Georgia Gkioxari

Assistant Professor California Institute of Technology

Andrea Vedaldi

Professor University of Oxford

Angel Chang

Associate Professor Simon Fraser University

Yi Fang

Associate Professor NYU Abu Dhabi & NYU Tandon

Organizers

Workshop Organizers

Habib Slim

Ph.D. Student KAUST

Mahmoud Ahmed

Ph.D. Student KAUST

Abdulwahab Felemban

Ph.D. Student KAUST

Wolfgang Heidrich

Professor KAUST

Peter Vajda

Researcher and Engineering Manager Meta AI

Natalia Neverova

Research Lead Meta AI

Mohamed Elhoseiny

Assistant Professor KAUST

Challenge Organizers

Junjie Fei

Ph.D. Student KAUST

Mahmoud Ahmed

Research Student KAUST

Xiang Li

Postdoctoral Researcher KAUST

Peter Wonka

Professor KAUST

Mohamed Elhoseiny

Assistant Professor KAUST

Dates

Timeline

Non-archival track:
  • Paper submission deadline: May 15th
  • Notification to authors: May 20th
  • Camera-ready deadline: May 31st
  • Workshop date: June 18th

3DCoMPaT-200 Challenge:
  • Release of training/validation data: March 1st
  • Test server online: March 20th
  • Submission deadline: May 30th
  • Fact sheets/source code submission deadline: June 10th
  • Winners announcement: June 14th

For any questions or support, please reach out to @Habib.S.