scikit-multiflow

Short description: Machine learning library for data streams in Python

scikit-multiflow
Original author(s): Jacob Montiel, Jesse Read, Albert Bifet, Talel Abdessalem
Developer(s): The scikit-multiflow development team and the open research community
Initial release: January 2018
Stable release: 0.5.3 / 17 June 2020[1][2]
Repository: https://github.com/scikit-multiflow/scikit-multiflow
Written in: Python, Cython
Operating system: Linux, macOS, Windows
Type: Library for machine learning
License: BSD 3-Clause license
Website: scikit-multiflow.github.io

scikit-multiflow (also known as skmultiflow) is a free and open-source machine learning library for multi-output/multi-label and stream data, written in Python.[3]

Overview

scikit-multiflow makes it easy to design and run experiments and to extend existing stream learning algorithms.[3] It features a collection of classification, regression, concept drift detection and anomaly detection algorithms. It also includes a set of data stream generators and evaluators. scikit-multiflow is designed to interoperate with Python's numerical and scientific libraries NumPy and SciPy and is compatible with Jupyter Notebooks.

Implementation

The scikit-multiflow library is developed under open research principles and is currently distributed under the BSD 3-Clause license. scikit-multiflow is mainly written in Python, and some core elements are written in Cython for performance. scikit-multiflow integrates with other Python libraries such as Matplotlib for plotting, scikit-learn for incremental learning methods[4] compatible with the stream learning setting, Pandas for data manipulation, NumPy and SciPy.

Components

scikit-multiflow is composed of the following sub-packages:

  • anomaly_detection: anomaly detection methods.
  • data: data stream methods including methods for batch-to-stream conversion and generators.
  • drift_detection: methods for concept drift detection.
  • evaluation: evaluation methods for stream learning.
  • lazy: methods in which generalisation of the training data is delayed until a query is received, i.e., neighbours-based methods such as kNN.
  • meta: meta learning (also known as ensemble) methods.
  • neural_networks: methods based on neural networks.
  • prototype: prototype-based learning methods.
  • rules: rule-based learning methods.
  • transform: data transformation methods.
  • trees: tree-based methods, e.g. Hoeffding Trees, which are a type of decision tree for data streams.
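As an illustration of the lazy approach listed above, the following is a minimal sketch of a k-nearest-neighbours classifier that keeps only a sliding window of recent samples and defers all computation to prediction time. It uses only the Python standard library and is not the library's implementation; the class name SlidingWindowKNN and its parameters k and window_size are assumptions made for this example.

```python
import math
from collections import Counter, deque


class SlidingWindowKNN:
    """Hypothetical lazy learner: stores recent samples, votes at query time."""

    def __init__(self, k=3, window_size=100):
        self.k = k
        # A bounded deque drops the oldest sample once the window is full.
        self.window = deque(maxlen=window_size)

    def partial_fit(self, x, y):
        # "Training" only stores the sample; no model is built here.
        self.window.append((tuple(x), y))

    def predict(self, x):
        if not self.window:
            return None  # nothing to compare against yet
        query = tuple(x)
        # Generalisation happens now: rank stored samples by distance.
        nearest = sorted(
            self.window, key=lambda item: math.dist(item[0], query)
        )[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]


model = SlidingWindowKNN(k=3, window_size=4)
samples = [
    ((0.0, 0.0), "a"),
    ((0.1, 0.0), "a"),
    ((1.0, 1.0), "b"),
    ((0.9, 1.0), "b"),
]
for features, label in samples:
    model.partial_fit(features, label)

prediction = model.predict((0.05, 0.0))
print(prediction)
```

Bounding the window keeps memory use constant on an unbounded stream and lets the model forget old samples, at the cost of a full scan of the window for every query.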

History

scikit-multiflow started as a collaboration between researchers at Télécom Paris (Institut Polytechnique de Paris[5]) and École Polytechnique. Development is currently carried out by the University of Waikato, Télécom Paris, École Polytechnique and the open research community.
