COMPARISON OF OUTLIER DETECTION TECHNIQUES IN NON-STATIONARY TIME SERIES DATA

Sampson Twumasi-Ankrah, Simon Kojo Appiah, Doris Arthur, Wilhemina Adoma Pels, Jonathan Kwaku Afriyie, Danielson Nartey

Abstract


This study examined the performance of six outlier detection techniques on a non-stationary time series dataset.
Two key issues were of interest. The first scenario was to identify the method that correctly detects the number of outliers
introduced into the dataset, while the second was to identify the technique that over-detects the number of outliers
introduced, when the dataset contains only extreme maxima, only extreme minima, or both. The air
passenger dataset was used, with the number of introduced outliers or extreme values ranging from 1 to 10 and 40. The six outlier
detection techniques considered were the Mahalanobis distance, depth-based, robust kernel-based outlier factor
(RKOF), generalized dispersion, kth nearest neighbor distance (KNND), and principal component (PC) methods.
When detecting extreme maxima, the Mahalanobis distance and principal component methods performed better at correctly
detecting the outliers in the dataset. The Mahalanobis distance method also identified more outliers than the others, making it
the "best" method for the extreme minima category. The kth nearest neighbor distance method was the "best" method
for not over-detecting the number of outliers for extreme minima, whereas the Mahalanobis distance and principal
component methods were the "best" performing methods for not over-detecting the number of outliers for the extreme
maxima category. The Mahalanobis distance technique is therefore recommended for detecting outliers in non-stationary
time series data.
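
The sketch below illustrates the general idea behind the recommended Mahalanobis distance approach. It is not the authors' code: the series is synthetic (a trend-plus-seasonality series standing in for the air passenger data), and the two-feature embedding (level and first difference) and the chi-square cutoff are illustrative assumptions rather than details taken from the study.

```python
# Minimal sketch: Mahalanobis-distance outlier detection on a
# non-stationary monthly series with injected extreme maxima.
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(42)

# Synthetic non-stationary monthly series: upward trend plus annual seasonality.
n = 144  # 12 years of monthly observations
t = np.arange(n)
series = 100 + 2.0 * t + 25 * np.sin(2 * np.pi * t / 12) + rng.normal(0, 5, n)

# Inject a few extreme maxima (outliers) at known positions.
outlier_idx = np.array([30, 75, 120])
series[outlier_idx] += 150

# Represent each point by its level and its month-over-month change,
# so the distance reacts to local jumps rather than the global trend.
X = np.column_stack([series[1:], np.diff(series)])

# Classical Mahalanobis distance of each point to the sample mean.
mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
centered = X - mu
d2 = np.einsum("ij,jk,ik->i", centered, cov_inv, centered)

# Flag observations beyond the 97.5% chi-square quantile (df = 2 features).
cutoff = chi2.ppf(0.975, df=X.shape[1])
flagged = np.where(d2 > cutoff)[0] + 1  # +1 realigns with the original index

print("injected:", outlier_idx)
print("flagged :", flagged)
```

Because the first difference reverses sign immediately after each spike, the point following an injected outlier may also exceed the cutoff; replacing the sample mean and covariance with robust estimates (e.g., minimum covariance determinant) is a common refinement when many outliers are present.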


Keywords


Outlier, time series, Mahalanobis method, depth-based method, generalized dispersion method.
