PTIT Knowledge Portal

International journal article

Toward Robust Malware Detection: A Survey of Datasets, Techniques, and Practical Challenges

Huỳnh Trọng Thưa

The increasing sophistication of malware has diminished the effectiveness of traditional signature-based detection. While Machine Learning (ML), Deep Learning (DL), and Large Language Models (LLMs) have improved malware classification, real-world systems continue to struggle with evasion attacks, temporal drift, and class imbalance. This study reviews advances in robust malware detection, focusing on benchmark datasets, detection methods, and operational constraints. Five public datasets (EMBER2018, SOREL-20M, MalDICT, MOTIF, and EMBER2024) are assessed for scale, label quality, and reproducibility. This paper contributes: (i) a Robust Malware Evaluation Protocol (RMEP) for consistent benchmarking at low False-Positive Rates (FPR ≤ 0.1%) with temporal splits, and (ii) a Dataset-Task-Robustness (DTR) matrix for systematic comparison, offering practical guidance for reproducible malware-detection research. Future efforts should focus on broader multi-platform benchmark coverage, explicit analysis of robustness–accuracy trade-offs, interpretable language-assisted detection pipelines, and privacy-preserving collaborative learning frameworks.
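The abstract does not spell out how RMEP is computed, so the following is only a minimal sketch of the two ingredients it names: a temporal train/test split and the true-positive rate at a capped false-positive rate. The function names `temporal_split` and `tpr_at_fpr`, the `first_seen` field, and the convention that a higher score means "more likely malicious" are all assumptions made for illustration, not details taken from the paper.

```python
from datetime import date


def temporal_split(samples, cutoff):
    """Split samples by time: train on samples first seen before `cutoff`,
    test on samples first seen on or after it.
    Assumes each sample is a dict with a hypothetical `first_seen` date."""
    train = [s for s in samples if s["first_seen"] < cutoff]
    test = [s for s in samples if s["first_seen"] >= cutoff]
    return train, test


def tpr_at_fpr(labels, scores, max_fpr=0.001):
    """Return the highest true-positive rate reachable by any score
    threshold whose false-positive rate stays <= max_fpr.
    labels: 1 = malicious, 0 = benign (assumed convention).
    scores: higher means more likely malicious (assumed convention)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one sample of each class")

    best_tpr = 0.0
    # Sweep thresholds from strictest to loosest; a sample is flagged
    # when its score is >= the threshold.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= thr)
        fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= thr)
        if fp / n_neg > max_fpr:
            break  # FPR can only grow as the threshold is lowered
        best_tpr = tp / n_pos
    return best_tpr


if __name__ == "__main__":
    # Toy data, invented purely to exercise the functions.
    samples = [
        {"first_seen": date(2023, 5, 1), "label": 1},
        {"first_seen": date(2024, 2, 1), "label": 0},
        {"first_seen": date(2024, 6, 1), "label": 1},
    ]
    train, test = temporal_split(samples, cutoff=date(2024, 1, 1))
    print(len(train), len(test))

    labels = [1, 1, 1, 0, 0, 0, 0]
    scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.05]
    print(tpr_at_fpr(labels, scores, max_fpr=0.0))
```

The default `max_fpr=0.001` corresponds to the 0.1% cap mentioned in the abstract. With so small a cap, a test set needs thousands of benign samples before a single false positive can be tolerated, which is one reason dataset scale matters for this kind of evaluation.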

Published in:

Toward Robust Malware Detection: A Survey of Datasets, Techniques, and Practical Challenges

Publication date:

2026

DOI:

https://etasr.com/index.php/ETASR/article/view/17500

Publisher:

Engineering, Technology and Applied Science Research (ETASR)

Location:

Keywords:

malware detection, robustness, benchmark datasets, Machine Learning (ML), Deep Learning (DL), Large Language Models (LLMs)

Related articles

From Public Benchmarks to a Low-Resource Target Domain: A Comparative Study of Wood Surface Defect Detection

Nguyễn Trọng Khánh

WiT: Wood Species Identification via a Hybrid CNN–Transformer With Query-Guided Cross-Attention

Ma Công Thành

On Deploying Bilinear Neural Network Method to Various Solutions of (3+1)-dimensional Potential Yu-Toda-Sasa-Fukuyama Equation

Nguyễn Minh Tuấn

Adaptive Cloud–Edge Coordination for Real-Time Phishing URL Detection With Distributed Caching and ONNX-Based Inference

Đàm Minh Lịnh

A novel entropy Autoencoder-Synchronized Hashing Semi-supervised network for robust Android malware identification

Nguyễn Huy Trung

DistilBERT for Efficient and Accurate Email Phishing Detection: A Benchmark Against Machine and Deep Learning Models

Đàm Minh Lịnh

Transfer Learning with Particle Swarm Optimization for Durian Leaf Disease Image Classification

Trần Nguyễn Phi Hùng