Topic

Technologies and technical equipment for agriculture and food industry

Volume

Volume 78 / No. 1 / 2026

Pages : 1456-1468

RAFE-DETR: AN RT-DETR-BASED ALGORITHM FOR MULTI-BEHAVIOR DETECTION IN GROUP-HOUSED PIGS

RAFE-DETR: A MULTI-BEHAVIOR DETECTION ALGORITHM FOR GROUP-HOUSED PIGS BASED ON RT-DETR (translated from the Chinese title)

DOI : https://doi.org/10.35633/inmateh-78-113

Authors

Lihong RONG

College of Mechanical and Electrical Engineering, Qingdao Agricultural University, Qingdao / China

Fang SUN

College of Mechanical and Electrical Engineering, Qingdao Agricultural University, Qingdao / China

Xiusong LI

Qingdao Big Herdsman Machinery Co., Ltd., Qingdao / China

Weilong ZHANG

College of Mechanical and Electrical Engineering, Qingdao Agricultural University, Qingdao / China

Chengguo HAN

College of Mechanical and Electrical Engineering, Qingdao Agricultural University, Qingdao / China

(*) Zhimin TONG

College of Mechanical and Electrical Engineering, Qingdao Agricultural University, Qingdao / China

(*) Corresponding author: Zhimin TONG, leicahit@qau.edu.cn

Abstract

Accurate detection of multiple behaviors in group-housed pigs is important for precision livestock farming and intelligent farm management. This study proposed RAFE-DETR, an improved detector based on RT-DETR, for recognizing five behaviors (standing, lying, feeding, drinking, and fighting) in overhead surveillance images. RFAConv was embedded into RepViT blocks to construct the RFA-RepViT backbone for stronger local feature extraction. The original intra-scale interaction module was replaced with BiFormer to improve contextual modeling. The neck was redesigned with ASF-CSA to enhance adaptive multi-scale fusion, and Focaler-Shape-IoU was introduced to refine bounding-box regression. Experiments were conducted on a five-class dataset reconstructed from public surveillance videos. The proposed model achieved 93.9% precision, 92.7% recall, and 94.2% mean average precision at an intersection-over-union threshold of 0.5 (mAP@0.5). Compared with RT-DETR-L, these values were higher by 1.4, 2.8, and 3.0 percentage points, respectively. At the same time, the number of parameters decreased from 32.0 M to 21.9 M, and the computational cost decreased from 103.5 to 77.0 GFLOPs. Supplementary experiments on a second public dataset supported the robustness of the method. Deployment on a Jetson Orin NX Super reached 13.8 and 19.1 frames per second under PyTorch and TensorRT, respectively, indicating good edge-deployment potential.
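The comparisons in the abstract can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the figures quoted above; the variable and function names are illustrative, and the derived baseline values and percentages are implied by those figures rather than stated in the abstract.

```python
# Figures quoted in the abstract for RAFE-DETR and its stated gains over RT-DETR-L.
rafe = {"precision": 93.9, "recall": 92.7, "map50": 94.2}  # percent
gain = {"precision": 1.4, "recall": 2.8, "map50": 3.0}     # percentage points

# Implied RT-DETR-L baseline: reported value minus the reported gain.
baseline = {k: round(rafe[k] - gain[k], 1) for k in rafe}


def relative_reduction(before: float, after: float) -> float:
    """Return the drop from `before` to `after` as a percentage of `before`."""
    return (before - after) / before * 100.0


param_cut = relative_reduction(32.0, 21.9)    # parameters, in millions
gflops_cut = relative_reduction(103.5, 77.0)  # GFLOPs
trt_speedup = 19.1 / 13.8                     # TensorRT FPS over PyTorch FPS

print(baseline)
print(f"parameters: -{param_cut:.1f}%, GFLOPs: -{gflops_cut:.1f}%, "
      f"TensorRT speedup: {trt_speedup:.2f}x")
```

Under these figures, the implied RT-DETR-L baseline is roughly 92.5% precision, 89.9% recall, and 91.2% mean average precision, the model is about 31.6% smaller in parameters and 25.6% lighter in GFLOPs, and TensorRT is about 1.38 times faster than PyTorch on the reported device.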

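Abstract in Chinese (English translation)

Multi-behavior detection in group-housed pigs is an important part of precision farming and intelligent pig-farm management. For the task of recognizing five behaviors (standing, lying, feeding, drinking, and fighting) in overhead surveillance images, this study proposes RAFE-DETR, an improved detection model based on RT-DETR. RFAConv is embedded into RepViT blocks to build the RFA-RepViT backbone and strengthen local feature extraction. The original intra-scale feature interaction module is replaced with BiFormer to improve contextual modeling. The neck is rebuilt as ASF-CSA to enhance adaptive multi-scale feature fusion, and Focaler-Shape-IoU is introduced to improve bounding-box regression quality. Experiments were carried out on a five-class behavior dataset reconstructed from public surveillance videos. The results show that RAFE-DETR reaches a precision, recall, and mAP@0.5 of 93.9%, 92.7%, and 94.2%, respectively, on the main dataset. Compared with RT-DETR-L, these three metrics improve by 1.4, 2.8, and 3.0 percentage points, respectively. Meanwhile, the parameter count falls from 32.0 M to 21.9 M, and GFLOPs fall from 103.5 to 77.0. Supplementary experiments on a second public dataset further verify the stability of the method. Deployment results show inference speeds of 13.8 FPS and 19.1 FPS on the Jetson Orin NX Super platform with the PyTorch and TensorRT backends, respectively, indicating good potential for edge deployment.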


Indexed in

Clarivate Analytics, Emerging Sources Citation Index
Scopus/Elsevier
Google Scholar
Crossref
ROAD