Clustering Noisy Time Series

Authors

  • Anastasiia Yevhenivna Tkachenko
  • Liudmyla Olehivna Kyrychenko
  • Tamara Anatoliivna Radyvylova

DOI:

https://doi.org/10.34185/1562-9945-3-122-2019-15

Keywords:

clustering, time series, distance function, k-means method, DBSCAN method

Abstract

Clustering objects is one of the pressing problems in machine learning. Time series clustering is used both as a standalone research technique and as a component of more complex data mining methods, such as rule discovery, classification, and anomaly detection.
A comparative analysis of methods for clustering noisy time series is carried out. The sample to be clustered contained time series of various types, including atypical objects. Clustering was performed with the k-means and DBSCAN methods using various distance functions for time series.
A numerical experiment was conducted to investigate how the k-means and DBSCAN methods perform on model time series with additive white noise. The clustered sample consisted of m time series of various types: harmonic realizations, parabolic realizations, and “bursts”.
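A sample of this kind can be sketched as follows. This is only an illustration under stated assumptions, not the authors' generator: the series length, frequency, burst shape, noise level, and all function names are hypothetical choices made for the example.

```python
import math
import random


def harmonic(n, freq=3.0):
    """Sine wave with `freq` full periods over n points (assumed form)."""
    return [math.sin(2 * math.pi * freq * t / n) for t in range(n)]


def parabolic(n):
    """Parabola u**2 with u running over [-1, 1] (assumed form)."""
    return [(2 * t / (n - 1) - 1) ** 2 for t in range(n)]


def burst(n, center=None, width=5):
    """Flat series with one short rectangular pulse (assumed form)."""
    center = n // 2 if center is None else center
    return [1.0 if abs(t - center) <= width else 0.0 for t in range(n)]


def add_white_noise(series, sigma, rng):
    """Add independent zero-mean Gaussian noise with standard deviation sigma."""
    return [x + rng.gauss(0.0, sigma) for x in series]


def make_sample(per_type=10, n=128, sigma=0.2, seed=0):
    """Return (series, labels) with per_type noisy realizations of each type."""
    rng = random.Random(seed)
    generators = {"harmonic": harmonic, "parabolic": parabolic, "burst": burst}
    series, labels = [], []
    for name, gen in generators.items():
        for _ in range(per_type):
            series.append(add_white_noise(gen(n), sigma, rng))
            labels.append(name)
    return series, labels


sample, labels = make_sample()
print(len(sample), "series of length", len(sample[0]))
```

Raising `sigma` makes the types harder to tell apart, which is the regime a noise-robustness comparison would need to explore.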
Noisy time series of various types were clustered in this work.
The DBSCAN and k-means methods were used with different distance functions. The best results were obtained with the DBSCAN method using the Euclidean metric and the CID function.
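For illustration, the two distance functions named above can be sketched as follows. The abstract does not define CID, so this sketch assumes it refers to the complexity-invariant distance, which scales the Euclidean distance by the ratio of the two series' complexity estimates. The function names and the handling of constant series are assumptions made for the example.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length series."""
    if len(x) != len(y):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def complexity_estimate(x):
    """Length of the series' 'stretched-out' line: sqrt of summed squared steps."""
    return math.sqrt(sum((x[i + 1] - x[i]) ** 2 for i in range(len(x) - 1)))


def cid(x, y):
    """Complexity-invariant distance: Euclidean distance times max(CE)/min(CE)."""
    ce_x, ce_y = complexity_estimate(x), complexity_estimate(y)
    lo, hi = min(ce_x, ce_y), max(ce_x, ce_y)
    if lo == 0.0:
        # Assumption: two constant series get a correction factor of 1;
        # a constant series versus a non-constant one is infinitely far.
        return euclidean(x, y) if hi == 0.0 else math.inf
    return euclidean(x, y) * hi / lo


smooth = [0.0, 1.0, 0.0, 1.0]
rough = [0.0, 2.0, 0.0, 2.0]
print(euclidean(smooth, rough), cid(smooth, rough))
```

The correction factor is at least 1, so CID never falls below the Euclidean distance and grows as the two series differ more in complexity.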
Analysis of the clustering results reveals the key differences between the methods: if the number of clusters can be determined and atypical time series do not need to be separated out, the k-means method gives fairly good results; if the number of clusters is unknown and atypical series must be isolated, the DBSCAN method is preferable.
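The second case can be illustrated with a minimal DBSCAN sketch that takes no cluster count and labels isolated series as noise. This is a textbook-style implementation written for the example, not the authors' code. The toy data (noise-free for reproducibility) and the parameter values `eps=1.0` and `min_pts=2` are assumptions.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def dbscan(items, eps, min_pts, dist=euclidean):
    """Return one label per item: a cluster id (0, 1, ...) or -1 for noise."""
    n = len(items)
    # Each neighborhood includes the item itself, as in the usual definition.
    neighbors = [[j for j in range(n) if dist(items[i], items[j]) <= eps]
                 for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border item later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border item reached from a core item
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # only core items expand the cluster
    return labels


n = 32
scales = (1.0, 1.05, 0.95)
sines = [[a * math.sin(math.pi * t / 8) for t in range(n)] for a in scales]
parabolas = [[a * (2 * t / (n - 1) - 1) ** 2 for t in range(n)] for a in scales]
atypical = [[5.0 if 14 <= t <= 17 else 0.0 for t in range(n)]]

series = sines + parabolas + atypical
labels = dbscan(series, eps=1.0, min_pts=2)
print(labels)
```

On this toy sample the three sines and the three parabolas form two clusters and the single burst-like series is labeled -1. A k-means run would instead require the number of clusters up front and would assign the atypical series to one of them.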

Published

2019-10-10