To address these limitations, a variety of approaches have been proposed to enhance cross-modal interaction and reinforce temporal coherence [7]. Some methods employ spatio-temporal attention mechanisms ...
We provide M3OT, a dataset for object detection and tracking in aerial imagery. M3OT is a multi-modality vehicle detection and tracking dataset acquired by two Unmanned Aerial Vehicles (UAVs) ...