International articles
When datasets deceive: Exposing overlap in smart contract vulnerability detection
Trần Tiến Công
Existing smart contract vulnerability datasets exhibit over 34% train–test overlap due to repeated function-level code, causing models to favor structural memorization over semantic generalization. To mitigate this issue, we construct a benchmark dataset with zero function overlap between the training and test partitions. Furthermore, we introduce GraphFusionDetect (GFD), a novel approach that integrates fine-tuned CodeBERT embeddings with Graph Neural Networks (GNNs) to capture inter-function dependencies. GFD achieves F1-scores of 80% for detecting reentrancy vulnerabilities and 89% for timestamp dependency vulnerabilities, surpassing baseline methods and enabling more robust and generalizable vulnerability detection.
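The function-level train/test overlap described in the abstract can be illustrated with a minimal sketch. It assumes that two functions count as "the same" when their source text matches after whitespace is collapsed; the abstract does not state how the authors define or measure overlap, so this matching rule, the helper names (`fingerprint`, `overlap_ratio`, `drop_overlapping`), and the toy Solidity snippets are all hypothetical.

```python
import hashlib
import re


def fingerprint(function_source: str) -> str:
    """Hash a function body after collapsing whitespace.

    Assumption: whitespace-insensitive exact match stands in for
    "repeated function-level code"; the paper may use another criterion.
    """
    normalized = re.sub(r"\s+", " ", function_source).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def overlap_ratio(train_functions, test_functions) -> float:
    """Fraction of test functions whose fingerprint also occurs in training data."""
    if not test_functions:
        return 0.0
    train_prints = {fingerprint(f) for f in train_functions}
    shared = sum(1 for f in test_functions if fingerprint(f) in train_prints)
    return shared / len(test_functions)


def drop_overlapping(train_functions, test_functions):
    """Keep only test functions that never appear in the training set."""
    train_prints = {fingerprint(f) for f in train_functions}
    return [f for f in test_functions if fingerprint(f) not in train_prints]


# Toy data (hypothetical): the first test function repeats a training
# function and differs only in whitespace.
train = [
    "function withdraw() public { msg.sender.call.value(balance)(); }",
    "function deposit() public payable { balance += msg.value; }",
]
test = [
    "function  withdraw()  public {\n msg.sender.call.value(balance)(); }",
    "function lock() public { unlockTime = block.timestamp + 1 days; }",
]

before = overlap_ratio(train, test)
clean_test = drop_overlapping(train, test)
after = overlap_ratio(train, clean_test)
print(f"overlap before: {before:.2f}, after filtering: {after:.2f}")
```

Filtering the test partition this way is only one route to zero function overlap; a real benchmark would probably also need to catch near-duplicates such as renamed identifiers, which exact hashing misses.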
Published in:
When datasets deceive: Exposing overlap in smart contract vulnerability detection
Publication date:
2026
Publisher:
ICT Express
Location:
Keywords:
Smart contracts; Vulnerability detection; Graph Neural Networks; CodeBERT; Dataset curation
Related articles
A Two-Stage Agent-based Framework for Network Attack Detection And Categorization in IoT
Nguyễn Huy Trung
Anomaly-based intrusion detection leveraging optimized firewall log analysis: a real-time machine learning solution
Tran Cong Hung
Improving the Web Crawling Accuracy with Machine Learning Based on Parsers Using Linguistic Structures
Nguyễn Minh Tuấn