Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: CQNU-ZhangLab
- License: agpl-3.0
- Language: Python
- Default Branch: main
- Size: 4.38 MB
Statistics
- Stars: 7
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
SFFNet: Synergistic Feature Fusion Network With Dual-Domain Edge Enhancement for UAV Image Object Detection
SFFNet is a synergistic feature fusion network tailored for object detection in UAV images, addressing challenges like background noise and target scale imbalance. It combines multi-scale dynamic dual-domain coupling for effective edge extraction and a synergistic feature pyramid network to enhance geometric and semantic representations, achieving excellent performance across various scales.
1. Chongqing Normal University 2. CETC Yizhihang (Chongqing) Technology Co., Ltd 3. Weifang University 4. Ocean University of China
📧 Corresponding author: wei.workstation@gmail.com
Relationship between AP and parameter count for different object detection algorithms on the VisDrone dataset. Points near the upper-left corner indicate models that achieve higher accuracy with fewer parameters. Our algorithm (marked "Ours") achieves a higher AP with fewer parameters than the others.
Abstract 📄
Object detection in unmanned aerial vehicle (UAV) images remains a highly challenging task, primarily due to complex background noise and imbalanced target scales. Traditional methods often struggle to separate objects from intricate backgrounds and fail to fully exploit the rich multi-scale information within images. To address these issues, we develop a synergistic feature fusion network (SFFNet) with dual-domain edge enhancement tailored for object detection in UAV images. First, we design the multi-scale dynamic dual-domain coupling (MDDC) module. It introduces a dual-driven edge extraction architecture that operates in both the frequency and spatial domains, effectively decoupling multi-scale object edges from background noise. Second, to further strengthen the neck's representation of both geometric and semantic information, we propose a synergistic feature pyramid network (SFPN). SFPN uses linear deformable convolutions to adaptively capture irregular object shapes and establishes long-range contextual associations around targets through the designed wide-area perception module (WPM). Moreover, to suit various applications and resource-constrained scenarios, we provide six detectors of different scales (N/S/M/B/L/X). Experiments on two challenging aerial datasets (VisDrone and UAVDT) demonstrate the strong performance of SFFNet-X, which achieves 36.8 AP and 20.6 AP, respectively. The lightweight models (N/S) also balance detection accuracy and parameter efficiency.
Performance 🏆 🌟
- VisDrone
| Model | AP | AP50 | AP75 | APs | APm | APl | #Params | FLOPs |
|:---------|:---:|:----:|:----:|:----:|:----:|:----:|:------:|:------:|
| SFFNet-N | 26.7 | 45.4 | 26.8 | 19.8 | 35.3 | 35.0 | 1.7M | 7.2G |
| SFFNet-S | 31.6 | 52.8 | 32.3 | 25.3 | 40.8 | 35.7 | 6.3M | 24.0G |
| SFFNet-M | 35.3 | 57.5 | 36.6 | 28.0 | 46.0 | 42.9 | 14.2M | 63.2G |
| SFFNet-B | 35.9 | 58.2 | 37.3 | 28.6 | 46.6 | 46.2 | 19.7M | 105.6G |
| SFFNet-L | 36.1 | 58.5 | 37.8 | 28.9 | 46.6 | 48.7 | 24.6M | 131.0G |
| SFFNet-X | 36.8 | 59.3 | 38.5 | 29.6 | 47.4 | 45.4 | 38.5M | 203.7G |
- UAVDT
| Model | AP | AP50 | AP75 | APs | APm | APl |
|:----------|:---:|:----:|:----:|:----:|:----:|:----:|
| ClusDet | 13.7 | 26.5 | 12.5 | 9.1 | 25.1 | 31.2 |
| CenterNet | 13.2 | 26.7 | 11.8 | 7.8 | 26.6 | 13.9 |
| DMNet | 14.7 | 24.6 | 16.3 | 9.3 | 26.2 | 35.2 |
| GFL | 16.9 | 29.5 | 17.9 | - | - | - |
| AMRNet | 18.2 | 30.4 | 19.8 | 10.3 | 31.3 | 33.5 |
| GLSAN | 19.0 | 30.5 | 21.7 | - | - | - |
| CEASC | 17.1 | 30.9 | 17.8 | - | - | - |
| PRDET | 19.1 | 33.8 | 19.8 | - | - | - |
| SCLNet | 20.0 | 33.1 | 22.3 | - | - | - |
| YOLOv8-X | 18.1 | 31.1 | 18.8 | 13.0 | 28.4 | 32.4 |
| YOLOv10-X | 19.5 | 32.3 | 21.0 | 13.4 | 28.9 | 31.5 |
| YOLO11-X | 19.0 | 33.0 | 19.6 | 13.0 | 30.7 | 32.2 |
| SFFNet-X | 20.6 | 34.4 | 21.9 | 14.3 | 31.5 | 33.3 |
Installation 🛠️
A conda virtual environment is recommended.
```shell
conda create -n SFFNet python=3.9
conda activate SFFNet
pip install -r requirements.txt
pip install -e .
```
Data Preparation 📦
To train on the VisDrone and UAVDT datasets, you need to organize them in the YOLO format. Follow the steps below to prepare your dataset:
1. Organize Images:
Structure your dataset directories as follows:
```shell
dataset/
├── images/
│   ├── train/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   └── val/
│       ├── image1.jpg
│       ├── image2.jpg
│       └── ...
└── labels/
    ├── train/
    │   ├── image1.txt
    │   ├── image2.txt
    │   └── ...
    └── val/
        ├── image1.txt
        ├── image2.txt
        └── ...
```
- images/train/: Contains all training images.
- images/val/: Contains all validation images.
- labels/train/: Contains all training labels.
- labels/val/: Contains all validation labels.
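In the YOLO format, each line of a label `.txt` file stores `class x_center y_center width height`, with coordinates normalized to the image size, whereas VisDrone's raw annotations store absolute top-left pixel boxes. A minimal conversion sketch (the function name and calling convention are illustrative, not part of this repository):

```python
def visdrone_to_yolo(x, y, w, h, img_w, img_h, class_id):
    """Convert a top-left (x, y, w, h) pixel box to a YOLO label line:
    'class x_center y_center width height', all coordinates normalized."""
    xc = (x + w / 2) / img_w
    yc = (y + h / 2) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w / img_w:.6f} {h / img_h:.6f}"

# Example: a 100x50 box at (300, 200) in a 1000x500 image, class 3 (car)
print(visdrone_to_yolo(300, 200, 100, 50, 1000, 500, 3))
# -> 3 0.350000 0.450000 0.100000 0.100000
```

Apply this per annotation line when writing the `labels/train/` and `labels/val/` files.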
2. Update Configuration Files:
Modify your VisDrone.yml or UAVDT.yml:
```yaml
path:  # dataset root dir
train: # train images (relative to 'path')
val:   # val images (relative to 'path')
test:  # test images (optional)

# Your Dataset Classes
names:
  0: pedestrian
  1: people
  2: bicycle
  3: car
  4: van
  5: truck
  6: tricycle
  7: awning-tricycle
  8: bus
  9: motor
```
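To sanity-check the prepared labels against the class map above before training, a small stand-alone checker might look like this (a hypothetical helper, not part of this repository):

```python
# Verifies that every YOLO label file in a directory uses a valid VisDrone
# class id and the five-field "class x_center y_center width height" layout.
from pathlib import Path

VISDRONE_NAMES = {
    0: "pedestrian", 1: "people", 2: "bicycle", 3: "car", 4: "van",
    5: "truck", 6: "tricycle", 7: "awning-tricycle", 8: "bus", 9: "motor",
}

def check_labels(labels_dir):
    """Return (filename, line_number) pairs for malformed label lines."""
    bad = []
    for txt in sorted(Path(labels_dir).glob("*.txt")):
        for n, line in enumerate(txt.read_text().splitlines(), start=1):
            parts = line.split()
            if len(parts) != 5 or not parts[0].isdigit() \
                    or int(parts[0]) not in VISDRONE_NAMES:
                bad.append((txt.name, n))
    return bad
```

Running it over `labels/train/` and `labels/val/` should return an empty list for a correctly prepared dataset.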
Quick validation ✅
```shell
python val.py
```
Quick training 🏋️‍♂️
```shell
python train.py
```
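Because the codebase follows the Ultralytics layout, an equivalent Python-API entry point would presumably look like the sketch below; treat the model config name ("SFFNet-n.yaml") and the hyperparameter defaults as placeholders, not the authors' published settings:

```python
def train_settings(data="VisDrone.yaml", epochs=300, imgsz=640, batch=16):
    """Collect assumed training hyperparameters in one place; the defaults
    are common Ultralytics-style values, not values from the paper."""
    return {"data": data, "epochs": epochs, "imgsz": imgsz, "batch": batch}

def main():
    from ultralytics import YOLO  # available after `pip install -e .`
    model = YOLO("SFFNet-n.yaml")  # hypothetical SFFNet-N model config
    model.train(**train_settings())
    model.val()  # evaluate on the dataset's val split

# Calling main() would launch a full training run, so it is not invoked here.
```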
Acknowledgement 🙏
The codebase is built on YOLOv10 and Ultralytics. Thanks for the great implementations!
Support 🌟
If our work helps your project, please give us a ⭐!
Owner
- Login: CQNU-ZhangLab
- Kind: user
- Repositories: 1
- Profile: https://github.com/CQNU-ZhangLab
Citation (CITATION.cff)
# This CITATION.cff file was generated with https://bit.ly/cffinit
cff-version: 1.2.0
title: Ultralytics YOLO
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Glenn
    family-names: Jocher
    affiliation: Ultralytics
    orcid: 'https://orcid.org/0000-0001-5950-6979'
  - given-names: Ayush
    family-names: Chaurasia
    affiliation: Ultralytics
    orcid: 'https://orcid.org/0000-0002-7603-6750'
  - family-names: Qiu
    given-names: Jing
    affiliation: Ultralytics
    orcid: 'https://orcid.org/0000-0003-3783-7069'
repository-code: 'https://github.com/ultralytics/ultralytics'
url: 'https://ultralytics.com'
license: AGPL-3.0
version: 8.0.0
date-released: '2023-01-10'
GitHub Events
Total
- Watch event: 4
- Push event: 17
- Create event: 2
Last Year
- Watch event: 4
- Push event: 17
- Create event: 2