https://github.com/cwi-dis/acmmm2024-oral

ACMMM2024(Oral)-PCQA

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.9%) to scientific vocabulary

Repository

ACMMM2024(Oral)-PCQA

Basic Info
  • Host: GitHub
  • Owner: cwi-dis
  • Language: Python
  • Default Branch: main
  • Size: 8.97 MB
Statistics
  • Stars: 1
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created almost 2 years ago · Last pushed about 1 year ago
Metadata Files
Readme

README.md

ACMMM2024-Oral:M3-Unity

Official repo for 'Deciphering Perceptual Quality in Colored Point Cloud: Prioritizing Geometry or Texture Distortion?', ACM MM 2024.

Motivation

Point clouds represent one of the prevalent formats for 3D content. Distortions introduced at various stages of the point cloud processing pipeline affect visual quality by altering the geometric composition, the texture information, or both. Understanding and quantifying the impact of the distortion domain on visual quality is vital for driving rate optimization and guiding post-processing steps to improve the quality of experience. In this paper, we propose a multi-task guided multi-modality no-reference metric (M3-Unity), which utilizes 4 types of modalities across attributes and dimensionalities to represent point clouds. An attention mechanism establishes inter/intra associations among 3D/2D patches, which can complement each other, yielding local and global features that fit the highly nonlinear property of the human vision system. A multi-task decoder involving distortion-type classification selects the best association among the 4 modalities, aiding the regression task and enabling in-depth analysis of the interplay between geometrical and textural distortions. Furthermore, our framework design and attention strategy enable us to measure the impact of individual attributes and their combinations, providing insights into how these associations contribute, particularly in relation to distortion type. Extensive experiments on 4 datasets show that M3-Unity consistently outperforms the state-of-the-art metrics by a large margin.

Framework

First, we preprocess the colored point cloud and extract multimodal features with 3D and 2D encoders, respectively. Second, we introduce the cross-attributes attentive fusion module, which captures local and global associations at both the intra- and inter-modality levels. Last, we employ dual decoders to jointly learn quality regression and distortion-type classification. The framework is designed to support further analysis: the modalities and associations are kept separate so that their individual contributions to visual quality can be measured.
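The attention step described above can be illustrated with a minimal sketch of scaled dot-product cross-attention, in which features from one modality attend over features from another. This is not the repository's fusion module: it assumes no learned projections, uses toy two-dimensional features, and the names `cross_attention`, `feats_2d`, and `feats_3d` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys, values):
    """Scaled dot-product attention: each query attends over all keys."""
    scale = math.sqrt(len(keys[0]))
    fused = []
    for q in queries:
        weights = softmax([dot(q, k) / scale for k in keys])
        # Weighted sum of the value vectors, one output vector per query.
        fused.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return fused


# Toy features (assumed): two 2D-projection patches attend over three 3D patches.
feats_2d = [[1.0, 0.0], [0.0, 1.0]]
feats_3d = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fused = cross_attention(feats_2d, feats_3d, feats_3d)
```

Swapping the roles of `feats_2d` and `feats_3d` gives the opposite attention direction, and passing the same list as queries, keys, and values gives an intra-modality variant.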

How to run the code

Environment Build

We train and test the code on Ubuntu 18.04 with open3d and python=3.7. PyTorch is installed with:

    conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=11.3 -c pytorch

The GPU is an A100 with 48 GB memory, and the batch size is 4.
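Before training, it can help to confirm that the packages named above are importable. The following is a small sketch using only the standard library; the helper name `check_environment` is hypothetical and not part of the repository, and it checks only that a package can be found, not its version.

```python
import importlib.util
import sys


def check_environment(required=("torch", "torchvision", "torchaudio", "open3d")):
    """Return a dict mapping each package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in required}


if __name__ == "__main__":
    print("Python", ".".join(str(part) for part in sys.version_info[:3]))
    for name, found in check_environment().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```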

Begin training

You can train M3-Unity by referring to train.sh. For example, to train M3-Unity on the SJTU-PCQA dataset, use the following command:

    # Add your wandb key below, otherwise the code will not run successfully.
    python -u train.py \
      --wandbkey "" \
      --learning_rate 0.00005 \
      --model M3_Unity \
      --batch_size 4 \
      --database SJTU \
      --data_dir_texture_img path to sjtu_projections/ \
      --data_dir_depth_img path to sjtu_depth_maps/ \
      --data_dir_normal_img path to sjtu_normal_maps/ \
      --data_dir_texture_pc path to sjtu_patch_texture/ \
      --data_dir_position_pc path to sjtu_patch_position/ \
      --data_dir_normal_pc path to sjtu_patch_normal/ \
      --loss l2rank \
      --num_epochs 100 \
      --k_fold_num 9 \
      --use_classificaiton 1 \
      --use_local 1 \
      --method_label with_dep_nor
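The `--k_fold_num 9` flag suggests k-fold cross-validation. How train.py builds its folds is not shown here, so the sketch below only illustrates one common convention in quality assessment, which is to split by reference content so that all distorted versions of one reference stay in the same fold. The function name `k_fold_by_reference` and the toy sample list are assumptions for illustration, not the repository's code or the dataset's actual file names.

```python
def k_fold_by_reference(samples, k):
    """Split (reference_name, sample_id) pairs into k folds by reference.

    All distorted versions of one reference land in the same fold, so no
    content appears in both the training and the test split.
    """
    references = sorted({ref for ref, _ in samples})
    folds = []
    for i in range(k):
        # Every k-th reference, starting at offset i, is held out for testing.
        test_refs = set(references[i::k])
        train = [s for s in samples if s[0] not in test_refs]
        test = [s for s in samples if s[0] in test_refs]
        folds.append((train, test))
    return folds


# Hypothetical toy data: 9 references with 42 distorted versions each.
samples = [(f"ref{r}", f"ref{r}_dist{d}") for r in range(9) for d in range(42)]
folds = k_fold_by_reference(samples, k=9)
```

With 9 references and `k=9`, each fold holds out exactly one reference, which is a leave-one-content-out split.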

The training data (projections and patches) will be made available soon.

Example Visualization

Analysis

The most interesting aspect of this framework design is that it offers insight into how the association between texture and geometry affects the final quality. Instead of running an extensive subjective study for point cloud quality assessment, we can estimate this relationship with M3-Unity. We have 4 modalities and 6 pairwise associations, which gives 6 equations. To derive the relationship among the 4 modalities, however, we must either remove two redundant equations or introduce more variables.

$(T_{2D} \bullet G_{2D}) = A$

$(T_{3D} \bullet G_{3D}) = B$

$(T_{2D} \bullet T_{3D}) = C$

$(T_{2D} \bullet G_{3D}) = D$

$(G_{2D} \bullet T_{3D}) = E$

$(G_{2D} \bullet G_{3D}) = F$

In this paper, we redefine texture as any combination of its 2D and 3D representations, and likewise for geometry:

$(T_{2D}, T_{3D}) = T'$

$(G_{2D}, G_{3D}) = G'$
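The count of six associations follows from choosing 2 of the 4 modalities. As a small illustrative sketch (the tuple encoding and the `classify` helper are assumptions, not taken from the repository), the pairs can be enumerated and grouped by whether they stay within texture, stay within geometry, or connect texture and geometry:

```python
from itertools import combinations

# Each modality is encoded as (attribute, dimensionality): T = texture, G = geometry.
MODALITIES = [("T", "2D"), ("G", "2D"), ("T", "3D"), ("G", "3D")]


def classify(pair):
    """Label a modality pair by the attributes it connects."""
    (attr_a, _), (attr_b, _) = pair
    if attr_a == attr_b == "T":
        return "within texture (T')"
    if attr_a == attr_b == "G":
        return "within geometry (G')"
    return "texture-geometry"


associations = list(combinations(MODALITIES, 2))
for a, b in associations:
    print(f"{a[0]}{a[1]} . {b[0]}{b[1]}: {classify((a, b))}")
```

Under this grouping, one pair falls inside the texture combination, one inside the geometry combination, and the remaining four connect texture with geometry.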

Hence we have the following results:

Future Work

Each assumption, however, needs to be verified by subjective studies [1]. The simplest assumption is to remove the least important associations and then solve for the 4 variables.

Bibtex


If you find our code useful, please cite the paper:

@inproceedings{zhou2024deciphering,
  title={Deciphering Perceptual Quality in Colored Point Cloud: Prioritizing Geometry or Texture Distortion?},
  author={Zhou, Xuemei and Viola, Irene and Chen, Yunlu and Pei, Jiahuan and Cesar, Pablo},
  booktitle={ACM Multimedia 2024}
}

If you encounter any issues with the code or training dataset, please contact xuemei.zhou@cwi.nl

Reference

[1] Lazzarotto, D., Testolina, M., & Ebrahimi, T. (2024). Subjective performance evaluation of bitrate allocation strategies for MPEG and JPEG Pleno point cloud compression. EURASIP Journal on Image and Video Processing, 2024(1).

Owner

  • Name: cwi-dis
  • Login: cwi-dis
  • Kind: organization
  • Location: Amsterdam, the Netherlands

CWI Distributed and Interactive Systems Group

GitHub Events

Total
  • Watch event: 1
  • Push event: 19
Last Year
  • Watch event: 1
  • Push event: 19