Temporal Degradation in Machine Learning-Based Malware Detection: A Multi-Dataset, Multi-Year Empirical Study

Huỳnh Trọng Thưa

Machine-learning malware detectors achieve near-perfect deployment accuracy yet silently degrade as threats evolve. We present a multi-dataset temporal study of this concept drift on 1.68 million Portable-Executable samples from EMBER 2017, EMBER 2018, and BODMAS (2019–2020), unified in the EMBER v2 feature space and analyzed with three classifier families (LightGBM, Random Forest, MLP) across nine experimental dimensions: in-era baselines, cross-era transfer, monthly drift tracking, incremental retraining, family-level false-negative decomposition, feature-group sensitivity, cumulative Area-Under-Time (AUT) analysis, drift-triggered retraining (ADWIN, DDM), and active-learning sample selection, with 10-seed statistical validation. Six findings emerge. (1) Forward degradation is asymmetric: under a strict appeared-year split, training on 2017 data loses 8.47 percentage points (pp) F1 on 2018 data (LightGBM, 10 seeds), whereas the reverse direction shows no degradation. (2) Unseen malware families dominate failures, with false-negative rates up to 23.92% and same-month ratios exceeding 30× relative to known families in the strongest case. (3) Cross-era robustness is feature-group dependent: SectionInfo and ImportsInfo dominate transfer (+0.85 and +0.37 pp respectively when retained), while HeaderFileInfo and StringExtractor act as temporal artifacts—zeroing them improves cross-era F1 by 0.67 and 0.47 pp respectively. (4) Incremental retraining with only 1% newly labeled data gains +0.56 pp cumulative AUT over a static baseline. (5) ADWIN/DDM-triggered retraining matches that AUT within 0.07–0.13 pp on LightGBM while issuing ∼33–35% fewer retrains, exposing a label-budget vs. accuracy trade-off. (6) Uncertainty sampling delivers a +0.76 pp AUT improvement over random sampling at identical labeling cost (p = 0.0020, Wilcoxon). 
Together, the results form a five-rung mitigation ladder (static, fixed 1%/month, ADWIN-triggered, DDM-triggered, and uncertainty-sampled) from which practitioners can choose according to their labeling budget and AUT requirements.
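To make two of the quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes AUT is the trapezoidal area under a per-period metric curve, normalized by the number of intervals (a definition commonly used in time-aware malware evaluation), and that uncertainty sampling picks the samples whose predicted malware probability is closest to 0.5. The abstract specifies neither detail, so both are assumptions, and the function names and input values are hypothetical.

```python
from typing import List, Sequence


def area_under_time(scores: Sequence[float]) -> float:
    """Trapezoidal area under a metric curve over equally spaced test periods.

    Assumed definition: the mean of the averages of adjacent period scores,
    so a constant score s over all periods yields an AUT of s.
    """
    n = len(scores)
    if n < 2:
        raise ValueError("AUT needs at least two time periods")
    total = sum((scores[k] + scores[k + 1]) / 2.0 for k in range(n - 1))
    return total / (n - 1)


def select_most_uncertain(probs: Sequence[float], budget: int) -> List[int]:
    """Return indices of the `budget` samples with probability closest to 0.5.

    Assumed uncertainty measure: distance of the predicted malware
    probability from the decision boundary at 0.5.
    """
    ranked = sorted(range(len(probs)), key=lambda i: abs(probs[i] - 0.5))
    return ranked[:budget]


if __name__ == "__main__":
    # Hypothetical monthly F1 scores of a static model (illustrative only).
    monthly_f1 = [0.98, 0.96, 0.93, 0.90]
    print(f"AUT = {area_under_time(monthly_f1):.4f}")

    # Hypothetical predicted probabilities for five unlabeled samples.
    probs = [0.02, 0.48, 0.91, 0.55, 0.30]
    print("Samples to label:", select_most_uncertain(probs, budget=2))
```

Under these assumptions, comparing mitigation strategies amounts to computing `area_under_time` on each strategy's monthly score series, while `select_most_uncertain` decides which new samples receive labels when the labeling budget is fixed.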

Published in:

Temporal Degradation in Machine Learning-Based Malware Detection: A Multi-Dataset, Multi-Year Empirical Study

Publication date:

2026

DOI:

https://ieeexplore.ieee.org/document/11576052

Publisher:

IEEE Access

Location:

Keywords:

Concept drift, malware detection, machine learning, temporal analysis, PE malware, intrusion detection systems

Related papers

EDIL-SegRayDP: Training-Free Iris Segmentation via Segmentation-First Ray-Wise Dynamic Programming

Huỳnh Trọng Thưa

Polar Topology Transformers With Anatomical Skip Connections for Efficient Iris Segmentation

Huỳnh Trọng Thưa

Efficient Iris Recognition via Polar Representation and Radial Stripe Attention

Huỳnh Trọng Thưa

From Public Benchmarks to a Low-Resource Target Domain: A Comparative Study of Wood Surface Defect Detection

Nguyễn Trọng Khánh

WiT: Wood Species Identification via a Hybrid CNN–Transformer With Query-Guided Cross-Attention

Ma Công Thành

On Deploying Bilinear Neural Network Method to Various Solutions of (3+1)-dimensional Potential Yu-Toda-Sasa-Fukuyama Equation

Nguyễn Minh Tuấn

Adaptive Cloud–Edge Coordination for Real-Time Phishing URL Detection With Distributed Caching and ONNX-Based Inference

Đàm Minh Lịnh