CVPR 2024 - Seattle, United States

C3DV: 2nd Workshop on Compositional 3D Vision

Second workshop on compositional 3D vision, featuring the VSIC and 3DCoMPaT++ dataset challenges, hosted at #CVPR2024.

Zoom Live Stream Link


Program

Workshop Program

All invited talks, oral presentations and the panel discussion will take place in Summit 327 of the Seattle Convention Center.
CVPR virtual conference link.
The workshop will take place on June 18th, 2024, from 8:50 AM to 6:00 PM (Seattle time).

Opening Remarks: 08:50 AM - 09:00 AM

Invited Talk 1 (Andrea Vedaldi): 09:00 AM - 09:35 AM

Andrea Vedaldi is Professor of Computer Vision and Machine Learning at the University of Oxford, where he has co-led the Visual Geometry Group since 2012. His current work focuses on generative AI for 3D computer vision: generating 3D objects from text and images, and using them as a basis for image understanding. He is the author of more than 200 peer-reviewed publications in the top machine vision and artificial intelligence conferences and journals. He is a recipient of the Mark Everingham Prize for selfless contributions to the computer vision community, the ACM Test of Time Award for his open-source software contributions, and the best paper award from the Conference on Computer Vision and Pattern Recognition.

Invited Talk 2 (Katerina Fragkiadaki): 09:35 AM - 10:10 AM

Katerina Fragkiadaki is an Assistant Professor in the Machine Learning Department at Carnegie Mellon University. Her research interests lie in building machines that can understand the stories portrayed in videos and, conversely, in using videos to teach machines about the world. The penultimate goal of her work is to build a machine that comprehends movie plots; the ultimate goal is to develop a machine that would prefer watching the films of Ingmar Bergman over other options. Prior to joining the faculty at Carnegie Mellon, she spent three years as a postdoctoral researcher, first at UC Berkeley working with Jitendra Malik, and then at Google Research in Mountain View, where she worked with the video group. She completed her Ph.D. in the GRASP (General Robotics, Automation, Sensing & Perception) program at the University of Pennsylvania under the guidance of Jianbo Shi. She completed her undergraduate studies at the National Technical University of Athens, and before that she lived in Crete.

Coffee & Poster session: 10:10 AM - 10:40 AM

Invited Talk 3 (Minhyuk Sung): 10:45 AM - 11:20 AM

Minhyuk Sung is an assistant professor in the School of Computing at KAIST, affiliated with the Graduate School of AI and the Graduate School of Metaverse. Before joining KAIST, he was a Research Scientist at Adobe Research. He received his Ph.D. from Stanford University under the supervision of Professor Leonidas J. Guibas. His research interests lie in vision, graphics, and machine learning, with a focus on 3D geometric data generation, processing, and analysis. His academic service includes serving on the program committees of SIGGRAPH Asia 2022, 2023, and 2024; Eurographics 2022 and 2024; Pacific Graphics 2023; and AAAI 2023 and 2024.

Invited Talk 4 (Jiajun Wu): 11:20 AM - 11:55 AM

Jiajun Wu is an Assistant Professor of Computer Science and, by courtesy, of Psychology at Stanford University, working on computer vision, machine learning, and computational cognitive science. Before joining Stanford, he was a Visiting Faculty Researcher at Google Research. He received his Ph.D. in Electrical Engineering and Computer Science from the Massachusetts Institute of Technology. His research has been recognized through Young Investigator Program (YIP) awards from ONR and AFOSR, the NSF CAREER award, paper awards and finalists at ICCV, CVPR, SIGGRAPH Asia, CoRL, and IROS, dissertation awards from ACM, AAAI, and MIT, the 2020 Samsung AI Researcher of the Year award, and faculty research awards from J.P. Morgan, Samsung, Amazon, and Meta.

Lunch break: 12:00 PM - 01:00 PM

Invited Talk 5 (Srinath Sridhar): 01:00 PM - 01:35 PM

Srinath Sridhar is an assistant professor of computer science at Brown University. He received his Ph.D. at the Max Planck Institute for Informatics and was subsequently a postdoctoral researcher at Stanford. His research interests are in 3D computer vision and machine learning. Specifically, his group (https://ivl.cs.brown.edu) focuses on visual understanding of 3D human physical interactions, with applications ranging from robotics to mixed reality. He is a recipient of the NSF CAREER award and a Google Research Scholar award, and his work received a Eurographics Best Paper Honorable Mention. He spends part of his time as a visiting academic at Amazon Robotics and has previously spent time at Microsoft Research Redmond and the Honda Research Institute.

Oral paper presentations: 01:35 PM - 03:00 PM

A presentation of the 3DCoMPaT++ challenge winners and their solutions, and of the VSIC challenge.

Coffee & Poster session: 03:15 PM - 04:00 PM

Invited Talk 6 (Xiaojuan Qi): 04:00 PM - 04:35 PM

Xiaojuan Qi is an assistant professor in the Department of Electrical and Electronic Engineering at the University of Hong Kong. She received her Ph.D. from the Chinese University of Hong Kong and has held research and visiting positions at the University of Toronto, the University of Oxford, and the Intel Visual Computing Group. She is committed to empowering machines with the ability to perceive, understand, and reconstruct the visual world in open-world settings, and to advancing their deployment in embodied agents.

Invited Talk 7 (Angela Dai): 04:35 PM - 05:10 PM

Angela Dai's research focuses on attaining a 3D understanding of the world around us: capturing and constructing semantically informed 3D models of real-world environments. This includes 3D reconstruction and semantic understanding from commodity RGB-D sensor data, and leveraging generative 3D deep learning to enable understanding of and interaction with 3D scenes for content creation and virtual or robotic agents. Prof. Dai received her Ph.D. in computer science from Stanford in 2018 and her BSE in computer science from Princeton in 2013. Her research has been recognized through a ZDB Junior Research Group Award, an ACM SIGGRAPH Outstanding Doctoral Dissertation Honorable Mention, and a Stanford Graduate Fellowship. Since 2020, she has been a professor at TUM, leading the 3D AI Lab.

Panel discussion: 05:15 PM - 06:00 PM

A panel discussion on the future of compositional 3D vision, with the invited speakers and other experts in the field.

Challenges

3DCoMPaT Challenge

3DCoMPaT++ Dataset

πŸ” Challenge overview


Grounded CoMPaT Recognition (GCR) is a compositional 3D vision task that aims to collectively recognize and ground compositions of materials on parts of 3D objects. The task is based on the 3DCoMPaT++ dataset, a large-scale dataset of stylized 3D objects and associated 2D renderings.
We propose two variations of this task, GCR-Coarse and GCR-Fine, based on coarse-grained and fine-grained 3D segmentations of the 3DCoMPaT models.
We highly encourage participants to enter and submit to both tracks of the challenge.
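To make the task concrete, here is a minimal sketch of what a GCR answer could look like: recognizing the object category and grounding a material on each part. All names, the data structure, and the exact-match check below are illustrative assumptions, not the official challenge format or evaluation metric.

```python
# Hypothetical sketch of a GCR-style prediction, NOT the official
# 3DCoMPaT++ evaluation protocol: class names, part names, and the
# exact-match criterion are illustrative assumptions only.
from dataclasses import dataclass

@dataclass(frozen=True)
class GCRPrediction:
    """A GCR answer: the object's category plus a material grounded on each part."""
    object_class: str
    part_materials: frozenset  # set of (part_name, material_name) pairs

def exact_match(pred: GCRPrediction, gt: GCRPrediction) -> bool:
    """A composition counts as fully recognized only if the object class
    and every (part, material) pairing agree with the ground truth."""
    return (pred.object_class == gt.object_class
            and pred.part_materials == gt.part_materials)

gt = GCRPrediction("chair", frozenset({("seat", "leather"), ("legs", "wood")}))
pred = GCRPrediction("chair", frozenset({("seat", "leather"), ("legs", "metal")}))
print(exact_match(pred, gt))  # False: the legs' material does not match
```

The key point the sketch illustrates is compositionality: recognizing the object or a single part in isolation is not enough, since the full set of part-material groundings must be correct.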

📊 Dataset


The 3DCoMPaT++ dataset for both challenge tracks is available through our download page.

📨 Submission


Submissions will be made through the eval.ai platform.

📜 Rules


Here are the rules for the challenge:

  • Submission Limit: Each participant is allowed to submit their solution a maximum of three times per day.
  • Data Usage: Participants are not permitted to use any data other than the 3DCoMPaT data for training their models.
  • Technical Report: Each participant must submit a technical report detailing their methods, which will be made public, in order to be eligible for any prizes or rewards.


πŸ† Awards


Total prize pool: 1500. Teams are encouraged to participate in both challenge tracks.

Fine track:
  • 1st place: 500
  • 2nd place: 250

Coarse track:
  • 1st place: 500
  • 2nd place: 250

These prizes are designed to reward exceptional performance and to encourage broad participation and innovation in the field of 3D computer vision. Note that eligibility is contingent on adherence to the challenge rules: participants must submit their solutions in accordance with the rules above and provide a technical report detailing their methods to be considered for any prizes or rewards.

💬 Q&A


If you encounter any technical issue related to the challenge, or if you're missing critical information, please open a ticket on our GitHub repository.


🎉 2023 Winning Solution


Below, we share the previous year's challenge winner and her winning solution repository:


Cattalya's repository


Challenges

VSIC Challenge

The Visual Shape Inference Challenge (VSIC) will focus on the task of inferring structured shape representations, in the form of language, from object-centered datasets. Inspired by recent advances in program synthesis for visual data (Ganeshan et al., 2023; Jones et al., 2023), this challenge aims to explore how modern machine learning techniques can be applied to improve the inference of shape programs from visual information. Building on the principles of compactness and structure, participants will be provided with object-centered datasets and will be tasked with generating refined and parsimonious shape programs, drawing inspiration from techniques like Sparse Intermittent Rewrite Injection (SIRI). Please stay tuned for more information.

Paper submission

Papers

🎯 Featured papers


We are pleased to present the list of papers accepted to the C3DV workshop at CVPR 2024. These papers showcase the latest research and notable progress made in the field of compositional 3D vision.



We extend our sincere appreciation to all the participants for their valuable contributions to the workshop. The C3DV workshop owes its success to the passion and expertise of these researchers. We cordially invite everyone to explore this curated collection of accepted papers.


🦜 Topics


Besides the CoMPaT and VSIC challenges, the C3DV workshop also accepts papers related to compositional 3D vision. The workshop will include a poster session and an oral session for related works. Topics of this workshop include but are not limited to:

  • Deep learning methods for compositional 3D vision
  • Self-supervised learning for compositional 3D vision
  • Visual relationship detection in 3D scenes
  • Zero-shot recognition/detection of compositional 3D visual concepts
  • Novel problems in 3D vision and compositionality
  • Text/composition to 3D generation
  • Text/composition-based editing of 3D scenes/objects
  • Language-guided 3D visual understanding (objects, relationships, ...)
  • Transfer learning for compositional 3D Vision
  • Multimodal pre-training for 3D understanding
  • Composition-based 3D object/scene search/retrieval
  • Compositional 3D vision aiding language problems
  • ...

Submitted 4-page abstracts, in CVPR format, will be peer-reviewed. Abstracts will be presented in the workshop poster session, and a portion of the accepted papers will be presented orally.



📨 Submission


Paper submissions will be handled with CMT through the following link:

Microsoft CMT: C3DVCVPR2024

Please select the appropriate track (archival or non-archival) and check for the relevant timelines in the dates section.

Speakers

Invited Speakers

  • Angela Dai, Assistant Professor, Technical University of Munich
  • Katerina Fragkiadaki, Assistant Professor, Carnegie Mellon University
  • Srinath Sridhar, Assistant Professor, Brown University
  • Minhyuk Sung, Assistant Professor, KAIST
  • Andrea Vedaldi, Professor, University of Oxford
  • Xiaojuan Qi, Assistant Professor, University of Hong Kong
  • Jiajun Wu, Assistant Professor, Stanford University

Organizers

Workshop Organizers

  • Habib Slim, Ph.D. Student, KAUST
  • Abdulwahab Felemban, Ph.D. Student, KAUST
  • Aymen Mir, Ph.D. Student, University of Tübingen
  • Wolfgang Heidrich, Professor, KAUST
  • Peter Vajda, Researcher and Engineering Manager, Meta AI
  • Natalia Neverova, Research Lead, Meta AI
  • Mohamed Elhoseiny, Assistant Professor, KAUST

Challenge Organizers

  • Aditya Ganeshan, Ph.D. Student, Brown University
  • Kenny Jones, Ph.D. Student, Brown University
  • Habib Slim, Ph.D. Student, KAUST
  • Mahmoud Ahmed, Research Student, KAUST
  • Xiang Li, Postdoctoral Researcher, KAUST
  • Daniel Ritchie, Assistant Professor, Brown University
  • Peter Wonka, Professor, KAUST
  • Mohamed Elhoseiny, Assistant Professor, KAUST

Dates

Timeline

Archival track (will appear in the CVPR proceedings):
  • Paper submission deadline: March 24th
  • Notification to authors: April 1st
  • Camera-ready deadline: April 7th
  • Workshop date: June 18th

Non-archival track:
  • Paper submission deadline: April 12th
  • Notification to authors: May 15th
  • Camera-ready deadline: May 31st
  • Workshop date: June 18th

3DCoMPaT Challenge:
  • Release of training/validation data: Feb. 17th
  • Validation server online: Feb. 20th
  • Test server online: Feb. 20th
  • Submission deadline: July 1st
  • Fact sheets/source code submission deadline: July 31st
  • Winners announcement: Aug. 16th

For any questions or support, please reach out to @Habib.S.