VIT-RBTFIR: A Hybrid Vision Transformer Method For Fast Image Retrieval
Châu Văn Vân
We present ViT-RBTFIR, a hybrid content-based image retrieval (CBIR) framework that couples Vision Transformers (ViT) for semantic feature extraction with a Randomized Binary Tree Forest (RBTF) for sublinear indexing in Hamming space. ViT embeddings are converted into compact binary codes by a lightweight hashing head, enabling fast candidate generation via RBTF. Across four benchmarks, the method delivers competitive Top-K accuracy while markedly reducing query latency compared with brute-force search. Complementary comparisons against established approximate nearest-neighbor baselines (FAISS and HNSW) indicate favorable speed-accuracy-memory trade-offs, especially at larger scales, supporting the practicality of ViT-RBTFIR for real-time retrieval.
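The pipeline summarized in the abstract (embedding, binary hashing, tree-forest candidate generation, Hamming re-ranking) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: random Gaussian vectors stand in for ViT embeddings, sign-of-random-projection stands in for the learned hashing head, and each tree simply splits on a randomly chosen bit. The names `binarize`, `build_tree`, `query_tree`, and `search` are hypothetical.

```python
import random

def binarize(vec, planes):
    """Sign of random projections -> tuple of bits.
    Assumption: a stand-in for the paper's learned hashing head."""
    return tuple(int(sum(v * w for v, w in zip(vec, plane)) > 0) for plane in planes)

def hamming(a, b):
    """Number of positions where two bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))

def build_tree(codes, ids, rng, leaf_size=16, depth=0, max_depth=32):
    """Recursively split item ids on a randomly chosen bit (illustrative rule)."""
    if len(ids) <= leaf_size or depth >= max_depth:
        return {"leaf": ids}
    bit = rng.randrange(len(codes[0]))
    zero = [i for i in ids if codes[i][bit] == 0]
    one = [i for i in ids if codes[i][bit] == 1]
    if not zero or not one:
        # Degenerate split: stop here and keep the bucket as a leaf.
        return {"leaf": ids}
    return {
        "bit": bit,
        "zero": build_tree(codes, zero, rng, leaf_size, depth + 1, max_depth),
        "one": build_tree(codes, one, rng, leaf_size, depth + 1, max_depth),
    }

def query_tree(tree, code):
    """Follow the query's bits down to a single leaf bucket."""
    while "leaf" not in tree:
        tree = tree["one"] if code[tree["bit"]] else tree["zero"]
    return tree["leaf"]

def search(forest, codes, query_code, k=5):
    """Union the leaf buckets from all trees, then re-rank by Hamming distance."""
    candidates = set()
    for tree in forest:
        candidates.update(query_tree(tree, query_code))
    ranked = sorted(candidates, key=lambda i: (hamming(codes[i], query_code), i))
    return [(i, hamming(codes[i], query_code)) for i in ranked[:k]]

rng = random.Random(0)
dim, n_bits, n_items = 32, 32, 500
# Random vectors stand in for ViT embeddings (assumption).
embeddings = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_items)]
planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
codes = [binarize(e, planes) for e in embeddings]
forest = [build_tree(codes, list(range(n_items)), rng) for _ in range(8)]
results = search(forest, codes, codes[42], k=5)  # list of (item id, Hamming distance)
```

Each query touches only one leaf per tree instead of every stored code, which is where the latency saving over brute-force search would come from in a design of this kind. Using several independently randomized trees raises the chance that true neighbors land in at least one visited bucket.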
Published in:
VIT-RBTFIR: A Hybrid Vision Transformer Method For Fast Image Retrieval
Date posted:
2026
Publisher:
Location:
Keywords:
Vision Transformer (ViT), Randomized Binary Tree Forest (RBTF), Content-Based Image Retrieval (CBIR), Feature Hashing, Real-Time Image Search
