International journal articles
A novel image retrieval framework: hybrid feature fusion with deep metric learning and uncertainty embedding https://link.springer.com/article/10.1007/s11760-025-04205-5
Nguyễn Thị Hạnh
Content-based image retrieval (CBIR) has made notable progress thanks to deep learning methods, particularly convolutional neural networks (CNNs). These methods have demonstrated competitive performance in feature extraction and representation. In addition, measuring the similarity between query images and database images using semantic features combined with deep metric learning has improved retrieval efficiency. However, current methods focus mainly on semantic features and do not fully exploit uncertainty features, which arise from noise or semantic ambiguity in images. Moreover, CNNs primarily extract local features and may not fully capture the global relationships between features, which is a notable strength of Vision Transformers (ViT). In this paper, we propose a novel image retrieval method named H-FUSE (Hybrid Feature fUSion with uncErtainty embedding). This method integrates both semantic and uncertainty features with deep metric learning while constructing a hybrid model that combines a CNN and a ViT. The hybrid model extracts both local and global features, drawing on the advantages of both networks to enhance image retrieval performance. The proposed method was evaluated on two benchmark datasets, CIFAR-100 and CUB-200-2011, and demonstrated competitive performance on their test sets. Specifically, on CIFAR-100, the method achieved a mean Average Precision (mAP) of 98.09% for Top-10 retrieval, while on CUB-200-2011 it achieved an mAP of 93.63%, suggesting its potential for practical use.
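To make the reported metric concrete, the following is a minimal sketch of how mAP over the Top-10 retrieved images could be computed from embeddings. It is not the authors' evaluation code: the abstract does not state the exact mAP variant, so this sketch assumes one common definition (precision averaged over the ranks where a same-class image appears within the top k) and cosine similarity for ranking. The function names `average_precision_at_k` and `mean_average_precision_at_k` are hypothetical.

```python
import numpy as np


def average_precision_at_k(relevant, k=10):
    """AP@k for one query; `relevant` is a ranked 0/1 sequence (1 = same class as the query)."""
    rel = np.asarray(relevant[:k], dtype=float)
    if rel.sum() == 0:
        return 0.0  # no relevant item in the top k
    # Precision at each rank position 1..len(rel)
    precision_at_i = np.cumsum(rel) / np.arange(1, len(rel) + 1)
    # Average the precision values at the ranks that hold a relevant item
    return float((precision_at_i * rel).sum() / rel.sum())


def mean_average_precision_at_k(query_emb, query_labels, db_emb, db_labels, k=10):
    """Mean of AP@k over all queries (assumed definition, cosine-similarity ranking)."""
    # L2-normalise so that the dot product equals cosine similarity
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = db_emb / np.linalg.norm(db_emb, axis=1, keepdims=True)
    sims = q @ d.T
    # Indices of the k most similar database images per query
    top_k = np.argsort(-sims, axis=1)[:, :k]
    # A retrieved image counts as relevant when its label matches the query label
    hits = db_labels[top_k] == query_labels[:, None]
    return float(np.mean([average_precision_at_k(h, k) for h in hits]))
```

In this sketch, `query_emb` and `db_emb` are 2-D arrays of embeddings (one row per image) and the label arrays are 1-D integer arrays; any embedding model could supply them.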
Published in:
A novel image retrieval framework: hybrid feature fusion with deep metric learning and uncertainty embedding https://link.springer.com/article/10.1007/s11760-025-04205-5
Publication date:
2025
Publisher:
Signal, Image and Video Processing
Location:
Keywords:
Vision Transformers (ViT)
