COMPARISON OF OUTLIER DETECTION TECHNIQUES IN NON-STATIONARY TIME SERIES DATA

Sampson Twumasi-Ankrah, Simon Kojo Appiah, Doris Arthur, Wilhemina Adoma Pels, Jonathan Kwaku Afriyie, Danielson Nartey

Abstract


This study examined the performance of six outlier detection techniques using a non-stationary time series dataset. Two scenarios were of interest. Scenario one was to find the method that could correctly detect the number of outliers introduced into the dataset, while scenario two was to find the technique that would over-detect the number of outliers introduced into the dataset, when the dataset contains only extreme maxima values, only extreme minima values, or both. The air passenger dataset was used with different numbers of outliers or extreme values, ranging from 1 to 10 and 40. The six outlier detection techniques used in this study were the Mahalanobis distance, depth-based, robust kernel-based outlier factor (RKOF), generalized dispersion, Kth nearest neighbors distance (KNND), and principal component (PC) methods. When detecting extreme maxima, the Mahalanobis and the principal component methods performed better in correctly detecting outliers in the dataset. Also, the Mahalanobis method could identify more outliers than the others, making it the "best" method for the extreme minima category. The KNND method was the "best" method for not over-detecting the number of outliers for extreme minima. However, the Mahalanobis distance and the principal component methods were the "best" performing methods for not over-detecting the number of outliers for the extreme maxima category. Therefore, the Mahalanobis outlier detection technique is recommended for detecting outliers in non-stationary time series data.
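To make the evaluation idea concrete, the following is a minimal sketch of how a Mahalanobis-distance detector might be scored against a known set of injected extreme maxima. It is not the study's implementation: the synthetic trend-plus-seasonal series, the choice of (time index, value) as the two-dimensional feature, the 0.975 chi-square cutoff, and the names `mahalanobis_outliers` and `make_series` are all assumptions made for illustration.

```python
import math


def mahalanobis_outliers(series, prob=0.975):
    """Return indices whose squared Mahalanobis distance in
    (time index, value) space exceeds a chi-square cutoff.

    Assumption: each observation is treated as the 2-D point (t, x_t).
    """
    n = len(series)
    times = list(range(n))
    mean_t = sum(times) / n
    mean_x = sum(series) / n

    # Entries of the 2x2 sample covariance matrix (n - 1 denominator).
    s_tt = sum((t - mean_t) ** 2 for t in times) / (n - 1)
    s_xx = sum((x - mean_x) ** 2 for x in series) / (n - 1)
    s_tx = sum((t - mean_t) * (x - mean_x)
               for t, x in zip(times, series)) / (n - 1)
    det = s_tt * s_xx - s_tx ** 2

    # The chi-square quantile with 2 degrees of freedom has the
    # closed form -2 * ln(1 - p), so no statistics library is needed.
    cutoff = -2.0 * math.log(1.0 - prob)

    flagged = []
    for t, x in zip(times, series):
        dt, dx = t - mean_t, x - mean_x
        # Quadratic form with the explicit inverse of the 2x2 covariance.
        d2 = (s_xx * dt * dt - 2.0 * s_tx * dt * dx + s_tt * dx * dx) / det
        if d2 > cutoff:
            flagged.append(t)
    return flagged


def make_series(n=144):
    """Hypothetical non-stationary series: linear trend plus a 12-step cycle."""
    return [100.0 + 2.0 * t + 20.0 * math.sin(2.0 * math.pi * t / 12.0)
            for t in range(n)]


# Inject extreme maxima at known positions (illustrative choice).
injected = [30, 75, 110]
series = make_series()
for i in injected:
    series[i] += 400.0

detected = mahalanobis_outliers(series)
correct = sorted(set(detected) & set(injected))   # correctly detected
extra = sorted(set(detected) - set(injected))     # over-detected

print("injected:", injected)
print("correctly detected:", correct)
print("over-detected:", extra)
```

Comparing `correct` and `extra` against `injected` corresponds to the two scenarios described above: whether a method recovers the introduced outliers, and whether it flags points that were never introduced. The same scoring could be repeated for extreme minima by subtracting rather than adding the shift.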


Keywords


Outlier, time series, Mahalanobis method, depth-based method, generalized dispersion method.
