Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ✓ .zenodo.json file: found .zenodo.json file
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (10.1%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: CQNU-ZhangLab
- License: agpl-3.0
- Language: Python
- Default Branch: main
- Size: 4.38 MB
Statistics
- Stars: 7
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
SFFNet: Synergistic Feature Fusion Network With Dual-Domain Edge Enhancement for UAV Image Object Detection
SFFNet is a synergistic feature fusion network tailored for object detection in UAV images, addressing challenges like background noise and target scale imbalance. It combines multi-scale dynamic dual-domain coupling for effective edge extraction and a synergistic feature pyramid network to enhance geometric and semantic representations, achieving excellent performance across various scales.
1. Chongqing Normal University 2. CETC Yizhihang (Chongqing) Technology Co., Ltd 3. Weifang University 4. Ocean University of China
📧 Corresponding author: wei.workstation@gmail.com
Relationship between AP and parameter count for different object detection algorithms on the VisDrone dataset. Points near the upper-left corner indicate models that achieve higher accuracy with fewer parameters. Our algorithm (marked "Ours") achieves a higher AP with fewer parameters than the others.
Abstract 📄
Object detection in unmanned aerial vehicle (UAV) images remains a highly challenging task, primarily due to complex background noise and imbalanced target scales. Traditional methods often struggle to separate objects from intricate backgrounds and fail to fully exploit the rich multi-scale information within images. To address these issues, we develop a synergistic feature fusion network (SFFNet) with dual-domain edge enhancement tailored for object detection in UAV images. First, we design the multi-scale dynamic dual-domain coupling (MDDC) module. It introduces a dual-driven edge extraction architecture that operates in both the frequency and spatial domains, effectively decoupling multi-scale object edges from background noise. Second, to further strengthen the neck's representation of both geometric and semantic information, we propose a synergistic feature pyramid network (SFPN). SFPN uses linear deformable convolutions to adaptively capture irregular object shapes and establishes long-range contextual associations around targets through the designed wide-area perception module (WPM). Moreover, to suit various applications and resource-constrained scenarios, we provide six detectors of different scales (N/S/M/B/L/X). Experiments on two challenging aerial datasets (VisDrone and UAVDT) demonstrate the strong performance of SFFNet-X, which achieves 36.8 AP and 20.6 AP, respectively. The lightweight models (N/S) also balance detection accuracy and parameter efficiency.
Performance 🏆 🌟
- VisDrone
| Model | AP | AP50 | AP75 | APs | APm | APl | #Params | FLOPs |
|:---------|:---:|:----:|:----:|:----:|:----:|:----:|:------:|:------:|
| SFFNet-N | 26.7 | 45.4 | 26.8 | 19.8 | 35.3 | 35.0 | 1.7M | 7.2G |
| SFFNet-S | 31.6 | 52.8 | 32.3 | 25.3 | 40.8 | 35.7 | 6.3M | 24.0G |
| SFFNet-M | 35.3 | 57.5 | 36.6 | 28.0 | 46.0 | 42.9 | 14.2M | 63.2G |
| SFFNet-B | 35.9 | 58.2 | 37.3 | 28.6 | 46.6 | 46.2 | 19.7M | 105.6G |
| SFFNet-L | 36.1 | 58.5 | 37.8 | 28.9 | 46.6 | 48.7 | 24.6M | 131.0G |
| SFFNet-X | 36.8 | 59.3 | 38.5 | 29.6 | 47.4 | 45.4 | 38.5M | 203.7G |
- UAVDT
| Model | AP | AP50 | AP75 | APs | APm | APl |
|:----------|:---:|:----:|:----:|:----:|:----:|:----:|
| ClusDet | 13.7 | 26.5 | 12.5 | 9.1 | 25.1 | 31.2 |
| CenterNet | 13.2 | 26.7 | 11.8 | 7.8 | 26.6 | 13.9 |
| DMNet | 14.7 | 24.6 | 16.3 | 9.3 | 26.2 | 35.2 |
| GFL | 16.9 | 29.5 | 17.9 | - | - | - |
| AMRNet | 18.2 | 30.4 | 19.8 | 10.3 | 31.3 | 33.5 |
| GLSAN | 19.0 | 30.5 | 21.7 | - | - | - |
| CEASC | 17.1 | 30.9 | 17.8 | - | - | - |
| PRDET | 19.1 | 33.8 | 19.8 | - | - | - |
| SCLNet | 20.0 | 33.1 | 22.3 | - | - | - |
| YOLOv8-X | 18.1 | 31.1 | 18.8 | 13.0 | 28.4 | 32.4 |
| YOLOv10-X | 19.5 | 32.3 | 21.0 | 13.4 | 28.9 | 31.5 |
| YOLO11-X | 19.0 | 33.0 | 19.6 | 13.0 | 30.7 | 32.2 |
| SFFNet-X | 20.6 | 34.4 | 21.9 | 14.3 | 31.5 | 33.3 |
Installation 🛠️
A conda virtual environment is recommended.
```shell
conda create -n SFFNet python=3.9
conda activate SFFNet
pip install -r requirements.txt
pip install -e .
```
Data Preparation 📦
To train on the VisDrone and UAVDT datasets, you need to organize them in the YOLO format. Follow the steps below to prepare your dataset:
1. Organize Images:
Structure your dataset directories as follows:
```shell
dataset/
├── images/
│   ├── train/
│   │   ├── image1.jpg
│   │   ├── image2.jpg
│   │   └── ...
│   └── val/
│       ├── image1.jpg
│       ├── image2.jpg
│       └── ...
└── labels/
    ├── train/
    │   ├── image1.txt
    │   ├── image2.txt
    │   └── ...
    └── val/
        ├── image1.txt
        ├── image2.txt
        └── ...
```
- images/train/: Contains all training images.
- images/val/: Contains all validation images.
- labels/train/: Contains all training labels.
- labels/val/: Contains all validation labels.
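In the YOLO format, each line of a label `.txt` file stores `class x_center y_center width height`, with coordinates normalized to the image size, whereas VisDrone's raw annotations store absolute top-left pixel boxes. A minimal conversion sketch (the function name and calling convention are illustrative, not part of this repository):

```python
def visdrone_to_yolo(x, y, w, h, img_w, img_h, class_id):
    """Convert a top-left (x, y, w, h) pixel box to a YOLO label line:
    'class x_center y_center width height', all coordinates normalized."""
    xc = (x + w / 2) / img_w
    yc = (y + h / 2) / img_h
    return f"{class_id} {xc:.6f} {yc:.6f} {w / img_w:.6f} {h / img_h:.6f}"

# Example: a 100x50 box at (300, 200) in a 1000x500 image, class 3 (car)
print(visdrone_to_yolo(300, 200, 100, 50, 1000, 500, 3))
# -> 3 0.350000 0.450000 0.100000 0.100000
```

Apply this per annotation line when writing the `labels/train/` and `labels/val/` files.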
2. Update Configuration Files:
Modify your VisDrone.yml or UAVDT.yml:
```yaml
path:  # dataset root dir
train: # train images (relative to 'path')
val:   # val images (relative to 'path')
test:  # test images (optional)

# Your Dataset Classes
names:
  0: pedestrian
  1: people
  2: bicycle
  3: car
  4: van
  5: truck
  6: tricycle
  7: awning-tricycle
  8: bus
  9: motor
```
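To sanity-check the prepared labels against the class map above before training, a small stand-alone checker might look like this (a hypothetical helper, not part of this repository):

```python
# Verifies that every YOLO label file in a directory uses a valid VisDrone
# class id and the five-field "class x_center y_center width height" layout.
from pathlib import Path

VISDRONE_NAMES = {
    0: "pedestrian", 1: "people", 2: "bicycle", 3: "car", 4: "van",
    5: "truck", 6: "tricycle", 7: "awning-tricycle", 8: "bus", 9: "motor",
}

def check_labels(labels_dir):
    """Return (filename, line_number) pairs for malformed label lines."""
    bad = []
    for txt in sorted(Path(labels_dir).glob("*.txt")):
        for n, line in enumerate(txt.read_text().splitlines(), start=1):
            parts = line.split()
            if len(parts) != 5 or not parts[0].isdigit() \
                    or int(parts[0]) not in VISDRONE_NAMES:
                bad.append((txt.name, n))
    return bad
```

Running it over `labels/train/` and `labels/val/` should return an empty list for a correctly prepared dataset.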
Quick validation ✅
```shell
python val.py
```
Quick training 🏋️‍♂️
```shell
python train.py
```
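Because the codebase follows the Ultralytics layout, an equivalent Python-API entry point would presumably look like the sketch below; treat the model config name ("SFFNet-n.yaml") and the hyperparameter defaults as placeholders, not the authors' published settings:

```python
def train_settings(data="VisDrone.yaml", epochs=300, imgsz=640, batch=16):
    """Collect assumed training hyperparameters in one place; the defaults
    are common Ultralytics-style values, not values from the paper."""
    return {"data": data, "epochs": epochs, "imgsz": imgsz, "batch": batch}

def main():
    from ultralytics import YOLO  # available after `pip install -e .`
    model = YOLO("SFFNet-n.yaml")  # hypothetical SFFNet-N model config
    model.train(**train_settings())
    model.val()  # evaluate on the dataset's val split

# Calling main() would launch a full training run, so it is not invoked here.
```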
Acknowledgement 🙏
The codebase is built on YOLOv10 and Ultralytics. Thanks for the great implementations!
Support 🌟
If our work helps your project, please give us a ⭐!
Owner
- Login: CQNU-ZhangLab
- Kind: user
- Repositories: 1
- Profile: https://github.com/CQNU-ZhangLab
Citation (CITATION.cff)
# This CITATION.cff file was generated with https://bit.ly/cffinit
cff-version: 1.2.0
title: Ultralytics YOLO
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Glenn
    family-names: Jocher
    affiliation: Ultralytics
    orcid: 'https://orcid.org/0000-0001-5950-6979'
  - given-names: Ayush
    family-names: Chaurasia
    affiliation: Ultralytics
    orcid: 'https://orcid.org/0000-0002-7603-6750'
  - family-names: Qiu
    given-names: Jing
    affiliation: Ultralytics
    orcid: 'https://orcid.org/0000-0003-3783-7069'
repository-code: 'https://github.com/ultralytics/ultralytics'
url: 'https://ultralytics.com'
license: AGPL-3.0
version: 8.0.0
date-released: '2023-01-10'
GitHub Events
Total
- Watch event: 4
- Push event: 17
- Create event: 2
Last Year
- Watch event: 4
- Push event: 17
- Create event: 2