Preprint
Article

SO-RTDETR for Small Object Detection in Aerial Images

Submitted: 24 October 2024

Posted: 25 October 2024

Abstract
In aerial image object detection, small targets present significant challenges due to limited pixel information, complex backgrounds, and sensitivity to bounding box perturbations. To tackle these issues, we propose SO-RTDETR for small object detection. The model introduces a Cross-Scale Feature Fusion with S2 (S2-CCFF) module, a Parallelized Patch-Aware attention (PPA) module, and the Normalized Wasserstein Distance (NWD) loss function, leading to significant performance improvements. Specifically, the S2-CCFF module enhances small object information by incorporating an additional S2 layer, while SPDConv downsampling maintains key details and reduces computational cost. The CSPOK-Fusion mechanism integrates global, local, and large branch features, capturing multi-scale representations and effectively mitigating interference from complex backgrounds and occlusions, thereby enhancing the spatial representation of features across scales. The PPA module, embedded in the Backbone network, leverages multi-level feature fusion and attention mechanisms to retain and strengthen small object features, addressing the issue of information loss. The NWD loss function, by focusing on the relative positioning and shape differences of bounding boxes, increases robustness to minor perturbations, enhancing detection accuracy. Experimental results on the VisDrone and NWPU VHR-10 aerial datasets demonstrate that our approach outperforms state-of-the-art detectors.
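The abstract does not state the NWD formula. As a minimal sketch, assuming the commonly used formulation in which each box (cx, cy, w, h) is modeled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), the similarity and the corresponding loss can be written as follows. The function names and the normalizing constant `c` are illustrative assumptions, not values taken from SO-RTDETR.

```python
import math


def wasserstein2_squared(box_a, box_b):
    """Squared 2-Wasserstein distance between the Gaussians fitted to two boxes.

    Each box is (cx, cy, w, h). Under the Gaussian assumption above, the
    distance reduces to a squared Euclidean distance between the vectors
    (cx, cy, w/2, h/2) of the two boxes.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    return (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein Distance in (0, 1]; 1 means identical boxes.

    `c` is a dataset-dependent scale (for example, a typical object size in
    pixels). The default here is only a placeholder assumption.
    """
    return math.exp(-math.sqrt(wasserstein2_squared(box_a, box_b)) / c)


def nwd_loss(box_a, box_b, c=12.8):
    """Regression loss derived from the similarity: 0 for identical boxes."""
    return 1.0 - nwd(box_a, box_b, c)


if __name__ == "__main__":
    pred = (12.0, 10.0, 8.0, 8.0)    # predicted box, shifted 2 px to the right
    target = (10.0, 10.0, 8.0, 8.0)  # ground-truth box
    far = (30.0, 10.0, 8.0, 8.0)     # box that does not overlap the target
    print("NWD(pred, target):", nwd(pred, target))
    print("NWD(far, target): ", nwd(far, target))
    print("loss(pred, target):", nwd_loss(pred, target))
```

Unlike IoU, which is zero for every pair of non-overlapping boxes, this measure decays smoothly with center offset and size difference, so a tiny box that shifts by a pixel or two still receives a graded score.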
Subject: Computer Science and Mathematics  -   Artificial Intelligence and Machine Learning
Copyright: This open access article is published under a Creative Commons CC BY 4.0 license, which permits free download, distribution, and reuse, provided that the author and preprint are cited in any reuse.
Preprints.org is a free preprint server supported by MDPI in Basel, Switzerland.

© 2024 MDPI (Basel, Switzerland) unless otherwise stated