Real-time phishing uniform resource locator detection based on hybrid embedding transformer and retraining-free inferencing
Đàm Minh Lịnh
Phishing attacks that evade traditional detection mechanisms by exploiting deceptive uniform resource locators (URLs) remain a significant cybersecurity threat. This study proposes an adaptive phishing URL detection framework that integrates a dynamically updated local blacklist, Levenshtein distance-based string similarity, and a server-side verification mechanism built on a hybrid embedding transformer (HET) encoder. First, a rapid local lookup identifies known phishing URLs. If the input URL is absent from the blacklist, the Levenshtein distance algorithm detects subtle character-level variations, which effectively identifies typosquatting and obfuscation. For ambiguous cases, the HET-based module uses a lightweight post-hoc inference method that classifies URL embeddings via k-nearest neighbor voting based on Euclidean similarity in the latent space, thereby avoiding retraining and enabling real-time adaptation to emerging phishing threats. Confirmed phishing URLs are added iteratively to the local repository, continuously improving detection and enhancing future classification accuracy. Experimental evaluation on a large-scale dataset comprising 235,795 URLs showed that the proposed method outperforms state-of-the-art approaches, achieving a detection accuracy of 99.8%, with a false-positive rate of 0.441% and a false-negative rate of 0.0617%. Additionally, real-time validation using a Chrome browser extension confirmed rapid processing, with an average processing time of 4.43–6.84 ms per URL on a dataset comprising 5,000 URLs. These results highlight the efficiency of the proposed framework in real-world cybersecurity contexts: it combines high detection accuracy, fast response times, and adaptability to evolving phishing strategies. They also underscore the importance of proactive threat intelligence and real-time phishing mitigation in developing scalable, high-performance security infrastructures.
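The three-stage cascade described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state what the Levenshtein stage compares against, so the sketch assumes a hypothetical list of trusted domains and a hypothetical edit threshold `max_edits`. The `embed` callable stands in for the HET encoder, and `labeled` is an assumed set of (embedding, label) reference pairs; all function and parameter names are illustrative.

```python
import math
from collections import Counter


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions) via row-wise DP."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def knn_vote(query, labeled, k=3):
    """Majority label among the k reference embeddings nearest to `query`
    by Euclidean distance. No model retraining is involved."""
    nearest = sorted(labeled, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def classify(url, blacklist, trusted_domains, embed, labeled,
             max_edits=2, k=3):
    """Hypothetical cascade: blacklist lookup, then edit-distance check,
    then k-NN voting on the URL embedding."""
    # Stage 1: fast local lookup of known phishing URLs.
    if url in blacklist:
        return "phishing"

    # Stage 2 (assumption): a domain within a few edits of a trusted
    # domain, but not identical to it, is treated as typosquatting.
    domain = url.split("//")[-1].split("/")[0].lower()
    for trusted in trusted_domains:
        if 0 < levenshtein(domain, trusted) <= max_edits:
            blacklist.add(url)
            return "phishing"

    # Stage 3: embed the URL and vote among its nearest labeled neighbors.
    label = knn_vote(embed(url), labeled, k)
    if label == "phishing":
        blacklist.add(url)  # confirmed URLs feed back into the blacklist
    return label
```

In this sketch, adapting to new threats amounts to appending new (embedding, label) pairs to `labeled` and new URLs to `blacklist`, which mirrors the retraining-free idea in the abstract without reproducing the paper's actual encoder or thresholds.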
Published in:
Real-time phishing uniform resource locator detection based on hybrid embedding transformer and retraining-free inferencing
Publication date:
2026
Publisher:
Computers and Electrical Engineering
Keywords:
Artificial intelligence, Cybersecurity, Hybrid embedding Transformer, Levenshtein distance, Retraining-free inferencing, URL data classification
