cca-gnnclust
Cross Camera Data Association using Supervised Clustering GNN
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (9.7%) to scientific vocabulary
Repository
Cross Camera Data Association using Supervised Clustering GNN
Basic Info
- Host: GitHub
- Owner: djordjened92
- Language: Python
- Default Branch: main
- Size: 6.92 MB
Statistics
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
Cross Camera Data Association using Supervised Clustering GNN
Introduction
This project is an attempt to apply Hi-LANDER (one of the Graph Neural Network methods for supervised graph clustering) to cross-camera instance matching. The full report can be found here.
The specific task in this project is matching persons across different camera views in different environments:
Method
Graph Creation
The input structure for this method is a directed graph $G = (V, E)$, where $V = \{v_i \mid i \in [1, N]\}$ is the set of nodes denoting all pedestrian bounding boxes. Each node is represented by an embedding $h_i$ initialized with the appropriate normalized feature $f_i$, forming the node embedding set $H = \{h_i \mid i \in [1, N]\}$. For each node of camera $c_i$, we find the single closest neighbor in every other camera view $c_j, j \neq i$ (green arrows in the cell $(l_1, a)$ in Table 1 between C1 and C2, and likewise for every other camera pair, shown as black arrows). This differs from Hi-LANDER, which applies plain kNN over the whole corpus of nodes. Per-camera neighbor selection reflects the setup in which a pedestrian typically appears at most once in each view.
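The per-camera neighbor selection can be illustrated with a minimal NumPy sketch. It is not the repository's implementation: the function name `build_cross_camera_edges`, the use of cosine similarity as the closeness measure, and the edge direction (from a node to its selected neighbor) are assumptions made for illustration.

```python
import numpy as np

def build_cross_camera_edges(features, cam_ids):
    """Return directed edges (i, j), where j is the nearest node to i in each other camera."""
    features = np.asarray(features, dtype=float)
    cam_ids = np.asarray(cam_ids)
    # L2-normalize so that the dot product equals the cosine similarity (assumed metric).
    feats = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = feats @ feats.T
    edges = []
    for i in range(len(feats)):
        for cam in np.unique(cam_ids):
            if cam == cam_ids[i]:
                continue  # never link to a node from the same view
            candidates = np.flatnonzero(cam_ids == cam)
            j = candidates[np.argmax(sim[i, candidates])]
            edges.append((i, int(j)))
    return edges

# Toy example: two people, each seen once by two cameras (4 bounding boxes).
features = [[1.0, 0.1], [0.1, 1.0], [0.9, 0.2], [0.2, 0.9]]
cam_ids = [0, 0, 1, 1]
edges = build_cross_camera_edges(features, cam_ids)
```

With $C$ cameras, every node gets exactly $C - 1$ outgoing edges, whereas plain kNN could place all $k$ neighbors in a single view.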
Graph Encoding
Using $h_i$ as the input embedding of the node $v_i$, the GCN encodes it as a new node embedding $h_i'$ in the following way:
$$h_i' = \phi(h_i, \sum_{v_j \in N_{v_i}} w_{ji}\psi(h_j))$$
where $\phi$ and $\psi$ are MLPs and $w_{ji}$ is a trainable vector. $N_{v_i} = \{v_j \mid (v_j, v_i) \in E\}$ is the neighborhood of node $v_i$, defined by the set of incoming edges.
The GCN encoder can be applied multiple times on the same graph, so the effect of the number of message passing steps is also explored in this work.
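A rough NumPy sketch of one encoding step, applied twice to the same graph, may help make the update rule concrete. Several details are assumptions rather than the project's actual layers: `mlp` is a one-layer stand-in for the MLPs $\phi$ and $\psi$, $\phi$ is assumed to act on the concatenation of its two arguments, $w_{ji}$ is applied elementwise, and all weights are random or constant placeholders instead of trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, w, b):
    """One-layer stand-in for an MLP: linear map followed by ReLU."""
    return np.maximum(x @ w + b, 0.0)

def gcn_step(h, edges, edge_w, psi, phi):
    """One message-passing step: h_i' = phi([h_i, sum_j w_ji * psi(h_j)])."""
    msg = mlp(h, *psi)                      # psi(h_j) for every node
    agg = np.zeros_like(msg)
    for (j, i), w in zip(edges, edge_w):    # edge (v_j, v_i) is incoming for v_i
        agg[i] += w * msg[j]                # elementwise weighting (assumption)
    return mlp(np.concatenate([h, agg], axis=1), *phi)

dim = 4
h = rng.normal(size=(3, dim))
edges = [(0, 1), (2, 1), (1, 0)]            # (source j, target i)
edge_w = np.ones((len(edges), dim))         # placeholder for the trainable w_ji
psi = (rng.normal(size=(dim, dim)), np.zeros(dim))
phi = (rng.normal(size=(2 * dim, dim)), np.zeros(dim))

h1 = gcn_step(h, edges, edge_w, psi, phi)   # first message-passing step
h2 = gcn_step(h1, edges, edge_w, psi, phi)  # second step on the same graph
```

Node 2 has no incoming edges here, so its aggregated message is zero and its new embedding depends only on its own features.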
Linkage Prediction and Node Density
After the Graph Encoding step, the resulting node features $H'$ are used to predict the linkage between nodes. The connectivity of the edge $(v_i, v_j)$ is predicted by applying an MLP classifier $\theta$. The input is a vector created by concatenating the node features ($h_i', h_j'$) and the nodes' ground plane positions $(\hat{x_i}, \hat{y_i})$, $(\hat{x_j}, \hat{y_j})$.
The original work considers the concatenation of node features only.
The output is a sigmoid activation which estimates the probability that two connected nodes have the same label:
$$\hat{r}_{ij} = P(r_i = r_j) = \sigma(\theta([h_i', \hat{x_i}, \hat{y_i}, h_j', \hat{x_j}, \hat{y_j}]^T))$$
The node density $d_i$ is a value that describes the weighted proportion of neighbors which have the same label as the node $v_i$. Its estimate is defined as:
$$\hat{d}_i = \frac{1}{k}\sum_{j=1}^{k}\hat{e}_{ij}a_{ij}$$
where $a_{ij} = \langle h_i, h_j \rangle$ is the similarity of the nodes' embeddings, and $\hat{e}_{ij}$ is the edge coefficient defined as:
$$\hat{e}_{ij} = P(r_i = r_j) - P(r_i \neq r_j).$$
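As a small worked example of the two formulas above, the sketch below computes the edge coefficients and the density estimate for a single node with $k = 3$ neighbors. The function names and the input values are made up for illustration. Since the two label events are complementary, the edge coefficient reduces to $2P(r_i = r_j) - 1$.

```python
import numpy as np

def edge_coefficient(p_same):
    """e_ij = P(r_i = r_j) - P(r_i != r_j) = 2 * P(r_i = r_j) - 1."""
    return 2.0 * np.asarray(p_same, dtype=float) - 1.0

def node_density(p_same, sim):
    """d_i = (1/k) * sum_j e_ij * a_ij over the k neighbors of node i."""
    return float(np.mean(edge_coefficient(p_same) * np.asarray(sim, dtype=float)))

# Hypothetical linkage probabilities and embedding similarities for 3 neighbors.
p_same = [0.9, 0.6, 0.2]
sim = [0.8, 0.5, 0.4]
e_hat = edge_coefficient(p_same)   # approximately [0.8, 0.2, -0.6]
d_hat = node_density(p_same, sim)  # (0.64 + 0.10 - 0.24) / 3
```

A neighbor that is likely to carry a different label (probability below 0.5) gets a negative coefficient and therefore lowers the density of the node.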
Graph Decoding
After estimating the graph attributes (node density and edge coefficient) using the GNN encoder, the connected components of the graph can be found in the following two steps:
Edge filtering: We initialize a new edge set $E' = \emptyset$. The subset of outgoing edges for each node $v_i$ is created as
$$\varepsilon(i) = \{j \mid (v_i, v_j) \in E \wedge \hat{d}_i \leq \hat{d}_j \wedge \hat{r}_{ij} \geq p_{\tau}\}$$
where $\hat{r}_{ij} = P(r_i = r_j)$ and $p_{\tau}$ is the edge connection threshold. Each node with a non-empty $\varepsilon(i)$ contributes one edge to the set $E'$, selected as
$$j = \arg\max_{k \in \varepsilon(i)} \hat{e}_{ik}$$
The edge $(v_i, v_j)$ is added to $E'$. With the condition $\hat{d}_i \leq \hat{d}_j$, the authors of Hi-LANDER introduced an inductive bias that discourages connections to nodes on the border of clusters.
Peak nodes: The set of edges $E'$ defines a new, refined graph $G'$ (cell $(l_1, b)$ in Table 1) on the same set of nodes. The peak nodes are those without outgoing edges; they have the maximum density in their neighborhood. The way $G'$ is created implies a separation of the graph into a set of connected components $Q = \{q_i \mid i \in [1, Z]\}$. Consequently, each connected component has one peak node, distinguished by the highest density in that component (cell $(l_1, c)$ in Table 1).
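The two decoding steps can be sketched in plain Python as follows. This is an illustrative reading of the description above, not the repository's code: the function name `decode`, the dictionary-based inputs, the default threshold value, and the union-find used to collect components are all assumptions. The sketch also assumes distinct densities; nodes with exactly equal densities could point at each other, a case it does not treat specially.

```python
def decode(edges, r_hat, e_hat, d_hat, p_tau=0.5):
    """Edge filtering, then connected components of the refined graph G'.

    edges: list of (i, j); r_hat, e_hat: dicts keyed by (i, j); d_hat: list of densities.
    Returns (parent, labels): parent[i] is the kept out-neighbor, or None for a peak node.
    """
    n = len(d_hat)
    parent = [None] * n
    for i in range(n):
        # epsilon(i): neighbors that are at least as dense and confidently linked
        cand = [j for (s, j) in edges
                if s == i and d_hat[i] <= d_hat[j] and r_hat[(i, j)] >= p_tau]
        if cand:
            parent[i] = max(cand, key=lambda j: e_hat[(i, j)])

    root = list(range(n))  # union-find over the kept edges

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]  # path compression
            x = root[x]
        return x

    for i, j in enumerate(parent):
        if j is not None:
            root[find(i)] = find(j)
    labels = [find(i) for i in range(n)]
    return parent, labels

# Hypothetical estimates for 5 nodes.
d_hat = [0.2, 0.9, 0.5, 0.1, 0.7]
edges = [(0, 1), (0, 2), (2, 1), (3, 4), (1, 0), (4, 3)]
r_hat = {(0, 1): 0.9, (0, 2): 0.8, (2, 1): 0.7, (3, 4): 0.95, (1, 0): 0.9, (4, 3): 0.95}
e_hat = {edge: 2 * p - 1 for edge, p in r_hat.items()}

parent, labels = decode(edges, r_hat, e_hat, d_hat)
peaks = [i for i, p in enumerate(parent) if p is None]
```

In this toy input, nodes 1 and 4 keep no outgoing edge because none of their neighbors is denser, so they become the peak nodes of the two resulting components.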
Hierarchical Design
The whole pipeline explained in the previous sections can be repeated on the final set of peak nodes as a new input (row $l_2$ in Table 1). A multi-level approach requires aggregating the features of each connected component from level $l$, which is replaced with a single node on level $l + 1$. The node embedding of the next level is defined as a concatenation of the peak node features and the mean node features:
$$h^{(l + 1)}_i = [\tilde{h}^{(l)}_{q_{i}}, \bar{h}^{(l)}_{q_{i}}].$$
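The aggregation can be illustrated with a short NumPy sketch. The function name `next_level_embeddings` and the toy inputs are hypothetical; the peak node of a component is picked as its highest-density member, following the description of peak nodes in the Graph Decoding section.

```python
import numpy as np

def next_level_embeddings(h, labels, d_hat):
    """Build h^(l+1): for each component, concatenate [peak features, mean features]."""
    h = np.asarray(h, dtype=float)
    labels = np.asarray(labels)
    d_hat = np.asarray(d_hat, dtype=float)
    out = []
    for q in np.unique(labels):
        members = np.flatnonzero(labels == q)
        peak = members[np.argmax(d_hat[members])]  # highest-density node of the component
        out.append(np.concatenate([h[peak], h[members].mean(axis=0)]))
    return np.stack(out)

# Three level-l nodes grouped into two components.
h = [[1.0, 0.0], [3.0, 2.0], [0.0, 4.0]]
labels = [0, 0, 1]
d_hat = [0.2, 0.9, 0.5]
h_next = next_level_embeddings(h, labels, d_hat)
```

Note that a plain concatenation doubles the embedding dimension from one level to the next, so in this sketch the encoder at level $l + 1$ would need to accept the wider input.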
Label back-propagation
Once the algorithm finishes and the final set of peak nodes is obtained, their labels can be propagated back to all nodes in the connected components they belong to. For the example in Table 1, the final partition is shown as different node colors in the following figure:
Requirements
In the docker directory run:
bash
docker build --rm --no-cache -t sgc-cca:v_1 -f Dockerfile .
Citation
If you use this software in your work, please cite it using:
@misc{nedeljković2024crosscameradataassociationgnn,
title={Cross-Camera Data Association via GNN for Supervised Graph Clustering},
author={Đorđe Nedeljković},
year={2024},
eprint={2410.00643},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2410.00643},
}
Owner
- Name: Djordje Nedeljkovic
- Login: djordjened92
- Kind: user
- Location: Belgrade
- Repositories: 2
- Profile: https://github.com/djordjened92
Citation (CITATION.cff)
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Nedeljković"
  given-names: "Đorđe"
  orcid: "https://orcid.org/0009-0008-3791-6983"
title: "Cross-Camera Data Association via GNN for Supervised Graph Clustering"
url: "https://github.com/djordjened92/cca-gnnclust"
Dependencies
- nvcr.io/nvidia/cuda 11.7.0-runtime-ubuntu20.04 build