PTIT Knowledge Portal

International paper

An Object Detection Framework Based on Relationship Between Objects in an Open Vocabulary Using Owl-VIT And RelTransformer

Nguyễn Thị Nguyệt

Object detection has been widely adopted across various applications, but traditional methods mainly provide isolated object locations without capturing their relationships. To address this limitation, we propose a system that detects both objects and their relationships within images based on natural language queries. The approach integrates OWL-ViT for open-vocabulary object detection, RelTR for relationship inference, and Large Language Models (LLMs) for query processing and language understanding. Unlike fixed-vocabulary models, OWL-ViT enables detection from free-text descriptions, improving generalization and supporting flexible user queries. Experimental results show that the proposed framework can localize objects and infer their relations with a recognition accuracy of 27%, demonstrating its potential for intelligent systems such as query-driven surveillance and human–machine interaction. This work is not merely a combination of existing models, but a deliberate integration designed to address the novel challenge of detecting object relationships from natural language queries.
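The abstract does not say how the OWL-ViT detections and the RelTR triplets are combined. Purely as an illustration, the sketch below assumes both stages output boxes as (x_min, y_min, x_max, y_max) tuples and keeps a relation triplet only when its subject and object boxes each overlap a query-matched detection above an IoU threshold. The function names, the data layout, and the 0.5 threshold are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)
Triplet = Tuple[Box, str, Box]           # (subject box, predicate, object box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_triplets(
    subject_boxes: List[Box],
    object_boxes: List[Box],
    triplets: List[Triplet],
    threshold: float = 0.5,  # assumed value, not from the paper
) -> List[Triplet]:
    """Keep triplets whose subject and object both match a query detection.

    subject_boxes / object_boxes stand in for open-vocabulary detections of
    the two entities named in the user's query (hypothetical inputs).
    """
    kept = []
    for subj, predicate, obj in triplets:
        subj_ok = any(iou(subj, d) >= threshold for d in subject_boxes)
        obj_ok = any(iou(obj, d) >= threshold for d in object_boxes)
        if subj_ok and obj_ok:
            kept.append((subj, predicate, obj))
    return kept
```

With this layout, a query such as "person riding a horse" would supply person boxes as `subject_boxes` and horse boxes as `object_boxes`; a further filter on the predicate string could then be applied to the surviving triplets.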

Published in:

An Object Detection Framework Based on Relationship Between Objects in an Open Vocabulary Using Owl-VIT And RelTransformer

Date posted:

DOI:

Publisher:

Location:

Keywords:

Object Detection, Relationship Detection, Open Vocabulary, OWL-ViT, RelTransformer.

Related papers

FA-Net: A Dual-Branch Attention Architecture for Extracting Fine-Grained Anatomical Features of Wood

Ma Công Thành

Optimizing Beamforming for Cell-Free MIMO ISAC Systems with Low-Resolution ADCs

Bùi Văn Kiên

TinyCDAE: Lightweight Convolutional Denoising Autoencoders for Real-Time Image Denoising on Resource-Constrained IoT Devices

Nguyễn Trọng Huân

Estimation of External Government Debt Thresholds: The Case of Vietnam

Đặng Thị Huyền Anh

Dynamic Customer Experience, Satisfaction, and Word-of-Mouth in Telecom-IT Sector

Nguyễn Quang Hưng

A Comparative Study of Transformer and Convolutional Neural Network Architectures

Nguyễn Trung Hiếu

A Survey on Methods of Applying Transformers to Non-NLP Applications

Nguyễn Trung Hiếu