International paper
Towards Universal Segmentation for Log Parsing
Lê Văn Hoàng
Log parsing is a crucial step in log analysis: it transforms unstructured log messages into the structured data required by downstream analysis tasks. The sheer volume of log data generated by modern software systems has motivated the development of numerous log parsing techniques. However, existing log parsers still suffer from unsatisfactory accuracy, which can significantly affect follow-up analysis such as log-based anomaly detection. We identify two main limitations that hinder the effectiveness of existing log parsing methods: (1) under-segmentation: most log parsers rely on a fixed, predefined set of delimiters to split a log message into tokens, which may fail to split log messages correctly because logging formats are heterogeneous; (2) over-segmentation: using too many delimiters fragments meaningful units in log messages and makes it difficult to identify templates and parameters accurately. To address these limitations, we propose SCLog, a novel syntax- and contextual-aware segmentation approach for log parsing. SCLog first applies a comprehensive set of syntax-based heuristics to segment log messages into coarse-grained tokens. To refine these into fine-grained tokens, SCLog mines the structural patterns of tokens from their surrounding contexts and dynamically identifies the optimal delimiters for each token. We evaluate SCLog on the widely used, large-scale Loghub-2.0 datasets. The results demonstrate that SCLog significantly outperforms state-of-the-art log parsers in parsing accuracy and robustness across diverse datasets.
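The two limitations named in the abstract can be illustrated with a minimal sketch. It is not SCLog's method: it only shows how a fixed delimiter set behaves on one log line. The log message and the helper `split_on` are hypothetical, made up for this illustration.

```python
import re


def split_on(message, delimiters):
    """Split a message on runs of any of the given single-character delimiters."""
    pattern = "[" + re.escape(delimiters) + "]+"
    return [tok for tok in re.split(pattern, message) if tok]


# Hypothetical log line (an assumption, not taken from Loghub-2.0).
message = "Connection from 10.0.0.5:8080 closed,status=200"

# Too few delimiters: "closed,status=200" stays glued together (under-segmentation).
few = split_on(message, " ")

# Too many delimiters: the IP address and port are shredded (over-segmentation).
many = split_on(message, " ,=:.")

print(few)
print(many)
```

With whitespace alone, the key-value pair is never separated from the preceding word; with the larger set, the address `10.0.0.5:8080` stops being a single meaningful unit. Neither fixed choice suits every token, which is the gap that per-token delimiter selection is meant to close.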
Published in:
Towards Universal Segmentation for Log Parsing
Publication date:
2026
Publisher:
Location:
Keywords:
Log Parsing, Segmentation, Syntactic Analysis, Structural Patterns
Related papers
Intelligent UAV Positioning and Fair Power Allocation via DRL in Cloud-Impaired HAP-to-UAV FSO/RF Systems
Nguyễn Quốc Huy
Digital transformation solution implementation risk in logistics and supply chain industry
Trần Thanh Hương
RDD-SPA: An efficient visual recognition algorithm for robot-assisted rapeseed pest control systems
Quanshu Song
3D Dynamic Radio Map Prediction Using Vision Transformers for Low-Altitude Wireless Networks
Nguyen Duc Minh Quang