DMelt:DataAnalysis/Anomaly Detection

Anomaly detection

Anomaly detection (also called outlier detection) is the identification of rare items, events or observations that raise suspicions by differing significantly from the majority of the data. DataMelt includes the EGADS library, which can be called from any script that runs on the Java platform. The library provides a collection of time-series and anomaly detection models that are applicable to a wide range of use cases.
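
To make the definition above concrete, here is a minimal, library-independent sketch of what "differing significantly from the majority of the data" can mean in practice. It flags points by a modified z-score based on the median and the median absolute deviation (MAD). This is a generic textbook rule chosen for illustration, not one of the EGADS models; the function names, the sample values and the threshold of 3.5 are assumptions of this sketch.

```python
# Illustrative only: a median/MAD outlier rule, not an EGADS model.
# Written to run unchanged under Jython 2.7 and Python 3.

def median(values):
    """Return the median of a non-empty sequence of numbers."""
    s = sorted(values)
    n = len(s)
    mid = n // 2
    if n % 2 == 1:
        return s[mid]
    return (s[mid - 1] + s[mid]) / 2.0


def find_anomalies(series, threshold=3.5):
    """Return the indices whose modified z-score exceeds `threshold`."""
    med = median(series)
    mad = median([abs(x - med) for x in series])
    if mad == 0:
        # At least half of the values equal the median;
        # this simple sketch flags nothing in that case.
        return []
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score
    return [i for i, x in enumerate(series)
            if 0.6745 * abs(x - med) / mad > threshold]


# Hypothetical measurements with one obvious spike at index 5
series = [10.1, 9.8, 10.3, 10.0, 9.9, 25.0, 10.2, 9.7, 10.1, 10.0]
print(find_anomalies(series))  # [5]
```

A rule like this ignores trend and seasonality, which is why dedicated time-series models are normally preferred for real data.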

To run the EGADS library [1], you need a configuration file (which defines what to do with your data) and the actual time-series data.
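
For experimenting with your own input, the following sketch builds a small synthetic time series with one injected spike and formats it as CSV text. The two-column `timestamp,value` layout, the start time, the hourly step and all function names are assumptions made for illustration; this page does not specify the input format, so compare the result with the sample data file shipped with the examples before relying on it.

```python
# A hedged sketch: synthetic time-series data with one artificial anomaly.
# The "timestamp,value" layout is an assumption, not a documented format.
import math


def make_series(n=200, start=1500000000, step=3600, spike_at=150):
    """Build (timestamp, value) rows: a 24-step sine wave plus one spike."""
    rows = []
    for i in range(n):
        value = 10.0 + 2.0 * math.sin(2.0 * math.pi * i / 24.0)
        if i == spike_at:
            value += 15.0  # the injected anomaly
        rows.append((start + i * step, value))
    return rows


def to_csv(rows, header=("timestamp", "value")):
    """Format the rows as CSV text with a single header line."""
    lines = [",".join(header)]
    for t, v in rows:
        lines.append("%d,%.4f" % (t, v))
    return "\n".join(lines) + "\n"


def write_csv(path, rows):
    """Write the rows to `path` as CSV text."""
    f = open(path, "w")
    try:
        f.write(to_csv(rows))
    finally:
        f.close()


rows = make_series()
text = to_csv(rows)
print(text.splitlines()[0])  # the header line
```

Calling `write_csv("my_input.csv", rows)` saves the series to a file (the file name is hypothetical) that can then be tried in place of the sample data.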

Here is a simple example that downloads a configuration file and a CSV data file, then runs the algorithm given in the configuration file. The type of output can also be customized:

# Anomaly detection for time series.
# You need at least a configuration file defining what to do,
# and a data file with the time series.
"""
This example uses EGADS (Extensible Generic Anomaly Detection System), an open-source Java package for automatically detecting anomalies in large-scale time-series data.
"""

from java.util import Properties
from java.io import FileInputStream
from com.yahoo.egads.utilities import FileInputProcessor
from jhplot import *

print "Get a sample config file and CSV data file"
conf="sample_config.ini"
data="sample_input.csv"
fhttp="https://datamelt.org/examples/data/egads/"
print Web.get(fhttp+conf)
print Web.get(fhttp+data)

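# Load the EGADS settings from the downloaded configuration file
p = Properties()
iss = FileInputStream(conf)
p.load(iss)
iss.close()

# Run the configured models on the time series in the CSV file
ip = FileInputProcessor(data)
ip.processInput(p)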

The code produces a GUI plot showing the original time-series data together with the detected anomalies.

  1. Nikolay Laptev, Saeed Amizadeh, Ian Flint, "Generic and Scalable Framework for Automated Time-series Anomaly Detection", KDD 2015