PTIT Knowledge Portal

International journal article

Toward Robust Malware Detection: A Survey of Datasets, Techniques, and Practical Challenges

Huỳnh Trọng Thưa

The increasing sophistication of malware has diminished the effectiveness of traditional signature-based detection. While Machine Learning (ML), Deep Learning (DL), and Large Language Models (LLMs) have improved malware classification, real-world systems continue to struggle with evasion attacks, temporal drift, and class imbalance. This study reviews advances in robust malware detection, focusing on benchmark datasets, detection methods, and operational constraints. Five public datasets (EMBER2018, SOREL-20M, MalDICT, MOTIF, and EMBER2024) are assessed for scale, label quality, and reproducibility. The paper makes two contributions: (i) a Robust Malware Evaluation Protocol (RMEP) for consistent benchmarking at low False-Positive Rates (FPR ≤ 0.1%) with temporal splits, and (ii) a Dataset-Task-Robustness (DTR) matrix for systematic comparison. Together these offer practical guidance for reproducible malware-detection research. Future efforts should focus on broader multi-platform benchmark coverage, explicit analysis of robustness–accuracy trade-offs, interpretable language-assisted detection pipelines, and privacy-preserving collaborative learning frameworks.

Published in:

Toward Robust Malware Detection: A Survey of Datasets, Techniques, and Practical Challenges


Publisher:

Engineering, Technology and Applied Science Research (ETASR)

Location:


Keywords:

malware detection, robustness, benchmark datasets, Machine Learning (ML), Deep Learning (DL), Large Language Models (LLMs)