Len Feremans

Teaching and research assistant (mandaatassistent)

Pattern-Based Anomaly Detection in Mixed-Type Time Series

Abstract: The present-day accessibility of technology enables easy logging of both sensor values and event logs over extended periods. In this context, detecting abnormal segments in time series data has become an important data mining task. Existing work on anomaly detection focuses either on continuous time series or discrete event logs and not on the combination. However, in many practical applications, the patterns extracted from the event log can reveal contextual and operational conditions of a device that must be taken into account when predicting anomalies in the continuous time series. This paper proposes an anomaly detection method that can handle mixed-type time series. The method leverages frequent pattern mining techniques to construct an embedding of mixed-type time series on which an isolation forest is trained. Experiments on several real-world univariate and multivariate time series, as well as a synthetic mixed-type time series, show that our anomaly detection algorithm outperforms state-of-the-art anomaly detection techniques such as MatrixProfile, Pav, Mifpod and Fpof.

By Len Feremans, Vincent Vercruyssen*, Boris Cule, Wannes Meert*, and Bart Goethals.

In Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Data (ECML PKDD 2019), Springer, 2019.

* Department of Computer Science, KU Leuven, Belgium

Efficiently mining cohesion-based patterns and rules in event sequences

Abstract: Discovering patterns in long event sequences is an important data mining task. Traditionally, research focused on frequency-based quality measures that allow algorithms to use the anti-monotonicity property to prune the search space and efficiently discover the most frequent patterns. In this work, we step away from such measures, and evaluate patterns using cohesion — a measure of how close to each other the items making up the pattern appear in the sequence on average. We tackle the fact that cohesion is not an anti-monotonic measure by developing an upper bound on cohesion in order to prune the search space. By doing so, we are able to efficiently unearth rare, but strongly cohesive, patterns that existing methods often fail to discover. Furthermore, having found the occurrences of cohesive itemsets in the input sequence, we use them to discover the representative sequential patterns and the dominant partially ordered episodes, without going through the computationally expensive candidate generation procedures typically associated with sequential pattern and episode mining. Experiments show that our method efficiently discovers important patterns that existing state-of-the-art methods fail to discover.

By Boris Cule, Len Feremans, and Bart Goethals.

In Data Mining and Knowledge Discovery, Volume 33(4), pp. 1125-1182, Springer, 2019.
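
To make the notion of cohesion in the abstract above concrete, here is a minimal sketch in Python. It assumes one common formalization that the abstract itself does not spell out: for every occurrence of an item of the itemset, take the length of the shortest window that covers that occurrence and contains all items of the itemset, average these lengths, and divide the itemset size by that average. The function name `cohesion`, the one-item-per-timestamp input, and the brute-force search are illustrative assumptions; this is not the pruning-based algorithm from the paper.

```python
from bisect import bisect_left


def cohesion(sequence, itemset):
    """Cohesion of `itemset` in `sequence` (one item per timestamp).

    Assumed definition: |itemset| divided by the average length of the
    shortest window around each occurrence of an item of the itemset
    that contains every item of the itemset.
    """
    itemset = set(itemset)
    # Sorted positions of every item of the itemset.
    positions = {
        item: [i for i, s in enumerate(sequence) if s == item]
        for item in itemset
    }
    # If some item never occurs, no window can contain the whole itemset.
    if not itemset or any(not pos for pos in positions.values()):
        return 0.0

    occurrences = [i for i, s in enumerate(sequence) if s in itemset]
    total_length = 0
    for t in occurrences:
        best = None
        # Try every left boundary a <= t; the right boundary b is then the
        # furthest "first occurrence at or after a" over all items.
        for a in range(t, -1, -1):
            b = t
            feasible = True
            for pos in positions.values():
                k = bisect_left(pos, a)
                if k == len(pos):  # this item does not occur at or after a
                    feasible = False
                    break
                b = max(b, pos[k])
            if feasible:
                length = b - a + 1
                if best is None or length < best:
                    best = length
        total_length += best

    average_window = total_length / len(occurrences)
    return len(itemset) / average_window


tight = cohesion(list("abxxab"), {"a", "b"})   # items always adjacent
loose = cohesion(list("axxb"), {"a", "b"})     # items far apart
```

Under this assumed definition, an itemset whose items always appear next to each other reaches the maximum cohesion of 1, however rarely it occurs, which is why a cohesion-based measure can surface rare but tightly co-occurring patterns.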

About

Member of the Adrem Data Lab research group, University of Antwerp, Department of Mathematics and Computer Science
Office G.323, Middelheimlaan 1, 2020 Antwerpen, Belgium
Tel: (+32) 3 265 38 73, E-mail: len.feremans(a)uantwerpen.be

Research Topic

My research is centered around mining patterns in sequential data based on novel definitions of interestingness, anomaly detection in mixed-type time series, and extreme multi-label classification.

Projects

Member of the HYMOP project, a strategic basic research (SBO) project whose academic objective is to advance the scientific state of the art. Within HYMOP I focus on fleet-based data mining applications and on developing new algorithms. One example application is pattern mining on data from a fleet of wind turbines.

Teaching assistant

2018-2019:*

2015-2016:

* Also assisting in 2014-2015, 2015-2016, 2016-2017