Dataset#

The dataset is separated into fine and coarse semantic levels, both for 2D and 3D data. To download the data, please accept our license agreement by filling out this Google form.

Completing the form will give you access to our S3 bucket bulk download script.

File sizes#

We provide here an indicative summary of file sizes for each data modality.

2D.

split | size (per composition)
train (WDS) | \(\approx\) 5 GB
test (WDS) | \(\approx\) 840 MB
valid (WDS) | \(\approx\) 400 MB

For a total of 100 compositions, the dataset thus reaches a full size of more than 500 GB (roughly 6 GB per composition across the three splits): be mindful of your local storage constraints when selecting a version to download in what follows.

Hint

We have now released the test set labels, as our CVPR 2023 challenge is over. You can load the test data just like any other split, using the split="test" parameter. For more information, please take a look at the documentation on the 2D dataloaders.

3D_PC.

split | n_comp | size
train (PC, HDF5) | 10 | 4.3 GB
test (PC, HDF5) | 10 | 588 MB
valid (PC, HDF5) | 10 | 358 MB

Note that for the test set, only pointclouds are available.

3D_ZIP.

split | n_comp | size
train, valid (MESHES, ZIP) | 1000 | 15.8 GB

For 3D, downloading the zip-packaged meshes and sampling pointclouds yourself can be more storage-efficient beyond roughly 40 compositions: the pre-extracted HDF5 pointclouds weigh about 0.4-0.5 GB per composition, while the mesh package is a flat 15.8 GB for all 1000 compositions.

The compressed 3D shapes package gives access to all 1000 styles per model and semantic level, but requires online sampling of pointclouds through the trimesh API.

Hint

We highly suggest pre-extracting pointclouds rather than sampling them from meshes online, which can become a bottleneck in your training pipeline (a minimal sketch follows below). For more information on 3D loaders, please refer to 3D dataloaders.
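As an illustration, pre-extraction can be done with the trimesh API mentioned above. This is a minimal sketch: the input mesh path and the output HDF5 layout are assumptions made for the example, not the official pre-extracted format.

import h5py
import numpy as np
import trimesh

# Load one stylized shape from the unzipped meshes package.
# (Illustrative path: point this at a mesh you actually extracted.)
mesh = trimesh.load("3DCoMPaT_3D_ZIP/unzipped/shape.glb", force="mesh")

# Uniformly sample 2048 points on the surface; sample_surface also
# returns the index of the source face for each sampled point.
points, face_idx = trimesh.sample.sample_surface(mesh, 2048)

# Cache the pointcloud so training never has to touch the mesh again.
# (The dataset name "points" is illustrative, not the official layout.)
with h5py.File("shape_pc.hdf5", "w") as f:
    f.create_dataset("points", data=np.asarray(points, dtype=np.float32))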

Renderings#

Download#

To download the first 10 compositions of the 2D data for the validation split at the coarse semantic level, change to the script's directory and run the following command:

python3 download.py --n-comp compat_010 --modality 2D --split valid --semantic-level coarse --outdir ./3DCoMPaT_2D/

The shards for each composition of the validation split will then be downloaded sequentially into the specified output directory.

Hint

To remain compatible with the provided 2D loaders, the folder structure starting from the 3DCoMPaT directory should not be altered. For more information on how to use the 2D tar shards, please refer to 2D dataloaders.
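Before wiring the shards into the provided loaders, you can sanity-check a download by iterating over it directly with the webdataset library. A minimal sketch, not the official loader; the shard path pattern is illustrative and should be adapted to your output directory:

import glob

import webdataset as wds

# Collect the downloaded tar shards (illustrative layout).
shards = sorted(glob.glob("./3DCoMPaT_2D/valid/*.tar"))
dataset = wds.WebDataset(shards)

# Print the keys stored in the first sample, then stop.
for sample in dataset:
    print(sample.keys())
    break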

Pointclouds/Meshes#

Pointclouds. The 3D data is provided both in zip-packaged form (for access to the stylized shape meshes) and as pre-extracted RGB pointclouds in HDF5 format.

Download#

To download the first 10 compositions of the 3D RGB pointclouds for the validation split at the coarse semantic level, change to the script's directory and run the following command:

python3 download.py --n-comp compat_010 --modality 3D_PC --split valid --semantic-level coarse --outdir ./3DCoMPaT_3D_PC/
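Once downloaded, the files can be quickly inspected with h5py. A minimal sketch with an illustrative filename; it lists dataset names and shapes without assuming the exact internal layout:

import h5py

# Open one of the downloaded HDF5 pointcloud files (illustrative name).
with h5py.File("./3DCoMPaT_3D_PC/valid_coarse.hdf5", "r") as f:
    # Print every dataset's name and shape, whatever the layout is.
    f.visititems(lambda name, obj: print(name, getattr(obj, "shape", "")))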


Compressed 3D (Meshes). To download the full compressed 3D data (1000 compositions, \(\approx\) 15.8 GB), run:

python3 download.py --modality 3D_ZIP --outdir ./3DCoMPaT_3D_ZIP/

Custom use#

For more advanced usage of the script, or for batched/parallel downloads, please refer to the manpage below.

usage: download.py [-h] --outdir OUTDIR --modality {2D,3D_ZIP,3D_PC} [--split {train,test,valid}] [--semantic-level {fine,coarse}] [--start-comp START_COMP] [--end-comp END_COMP] [--n-comp {compat_010,compat_050,compat_100}]

Download the 3DCoMPaT++ dataset.

options:
  -h, --help            show this help message and exit
  --outdir OUTDIR       Output folder in which the tars should be downloaded
  --modality {2D,3D_ZIP,3D_PC}
                        Data type to download. Use "2D" for WDS-packaged 2D data, "3D_ZIP" for zip-packaged 3D shapes, "3D_PC" for 3D pointclouds.
  --split {train,test,valid}
                        Split to download. Invalid for "3D_ZIP".
  --semantic-level {fine,coarse}
                        Semantic level of the 2D images/3D pointclouds. Invalid for "3D_ZIP".
  --start-comp START_COMP
                        Start index of the range of compositions to download.
  --end-comp END_COMP   End index of the range of compositions to download. (included)
  --n-comp {compat_010,compat_050,compat_100}
                        Number of compositions to download. Use "compat_010", "compat_050" or "compat_100" to download 10/50/100 rendered compositions.

You can use --start-comp and --end-comp to split 2D downloads across a few processes.
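For instance, the following pair of commands splits a coarse 2D train download across two background processes. This is only an illustration: it assumes compositions are indexed from 0, so verify the indexing convention before launching large downloads.

python3 download.py --modality 2D --split train --semantic-level coarse --start-comp 0 --end-comp 49 --outdir ./3DCoMPaT_2D/ &
python3 download.py --modality 2D --split train --semantic-level coarse --start-comp 50 --end-comp 99 --outdir ./3DCoMPaT_2D/ &
wait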

Checksums#

Bundled with the zipped download script, we provide checksums for the 2D WebDataset tar shards, the 3D HDF5 files and the 3D packaged shapes.

download/hashes/2D_sums.md5
download/hashes/3D_ZIP_sum.md5
download/hashes/3D_HDF5_sums.md5

To re-compute them, change to the folder containing either the shards or the HDF5 files and run:

find . -type f -exec md5sum {} + | sort -k2 > sums.md5

You can then compare your outputs with the provided files.
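For example, the 2D sums can then be compared with a simple diff (adjust the path to wherever the hashes were unzipped; the comparison is only meaningful if both files list the same relative paths):

diff sums.md5 download/hashes/2D_sums.md5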

Challenge#

The 3DCoMPaT challenge is evaluated on 10 compositions of the test set. Pointclouds for the test set are pre-extracted and can be downloaded with the following commands:

Coarse-grained track.

python3 download.py --n-comp compat_010 --modality 3D_PC --split test --semantic-level coarse --outdir ./3DCoMPaT_3D_PC/

Fine-grained track.

python3 download.py --n-comp compat_010 --modality 3D_PC --split test --semantic-level fine --outdir ./3DCoMPaT_3D_PC/

Matching 2D data can be obtained using the commands provided in the above sections.
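For instance, the coarse 2D test renderings for these 10 compositions can be fetched with the same pattern as above (output directory illustrative):

python3 download.py --n-comp compat_010 --modality 2D --split test --semantic-level coarse --outdir ./3DCoMPaT_2D/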