= Dimensionality reduction =
[[Dimensionality reduction]] transforms high-dimensional data into data with fewer dimensions <br />
while preserving as much of the original information as possible.<br />
Consider the [https://datamelt.org/examples/data/iris_org.arff Iris dataset]<ref>Fisher, R.A. "The use of multiple measurements in taxonomic problems", Annals of Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to Mathematical Statistics" (John Wiley, NY, 1950).</ref>. <br />
The Iris dataset has 4 numerical attributes, so its samples cannot be shown directly in a two-dimensional plot.<br />
To visualize the data, one can reduce its dimensionality from four to two. <br />
We will use [[Principal component analysis]] (PCA), which <br />
converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.<br />
PCA needs the data samples to have a mean of zero, so we first apply a transform that ensures this property.<br />
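The two steps described above (centering, then projecting onto principal components) can be sketched in plain Java without any library. This is only an illustrative sketch, not the JSAT implementation: the class name <code>PcaSketch</code>, its method names, and the six hand-typed four-attribute rows are assumptions made for the example, and the eigenvectors are found by simple power iteration with deflation, which assumes the leading eigenvalues of the covariance matrix are distinct and non-zero.

```java
import java.util.Arrays;

// Hypothetical helper class for illustration only; not part of JSAT or DataMelt.
class PcaSketch {

    /** Subtracts each column's mean so that every attribute has zero mean. */
    static double[][] center(double[][] x) {
        int n = x.length, d = x[0].length;
        double[] mean = new double[d];
        for (double[] row : x)
            for (int j = 0; j < d; j++) mean[j] += row[j] / n;
        double[][] out = new double[n][d];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < d; j++) out[i][j] = x[i][j] - mean[j];
        return out;
    }

    /** Sample covariance matrix of data that is already centered. */
    static double[][] covariance(double[][] c) {
        int n = c.length, d = c[0].length;
        double[][] cov = new double[d][d];
        for (double[] row : c)
            for (int a = 0; a < d; a++)
                for (int b = 0; b < d; b++) cov[a][b] += row[a] * row[b] / (n - 1);
        return cov;
    }

    /** Multiplies a square matrix by a vector. */
    static double[] multiply(double[][] m, double[] v) {
        double[] w = new double[v.length];
        for (int a = 0; a < v.length; a++)
            for (int b = 0; b < v.length; b++) w[a] += m[a][b] * v[b];
        return w;
    }

    /** Dominant unit eigenvector of a symmetric matrix, found by power iteration. */
    static double[] dominantEigenvector(double[][] m, int iterations) {
        double[] v = new double[m.length];
        Arrays.fill(v, 1.0 / Math.sqrt(m.length));
        for (int it = 0; it < iterations; it++) {
            double[] w = multiply(m, v);
            double norm = 0;
            for (double z : w) norm += z * z;
            norm = Math.sqrt(norm);
            for (int a = 0; a < v.length; a++) v[a] = w[a] / norm;
        }
        return v;
    }

    /** Removes the direction v from m (deflation): m - lambda * v * v^T. */
    static double[][] deflate(double[][] m, double[] v) {
        double[] mv = multiply(m, v);
        double lambda = 0;
        for (int a = 0; a < v.length; a++) lambda += v[a] * mv[a];
        double[][] out = new double[m.length][m.length];
        for (int a = 0; a < m.length; a++)
            for (int b = 0; b < m.length; b++) out[a][b] = m[a][b] - lambda * v[a] * v[b];
        return out;
    }

    /** Centers x and projects it onto its first k principal components. */
    static double[][] project(double[][] x, int k) {
        double[][] c = center(x);
        double[][] cov = covariance(c);
        double[][] z = new double[x.length][k];
        for (int p = 0; p < k; p++) {
            double[] v = dominantEigenvector(cov, 500);
            for (int i = 0; i < c.length; i++)
                for (int j = 0; j < v.length; j++) z[i][p] += c[i][j] * v[j];
            cov = deflate(cov, v); // the next pass finds the next component
        }
        return z;
    }

    public static void main(String[] args) {
        // Small hand-typed sample with four attributes per row (illustrative, not the full data set).
        double[][] sample = {
            {5.1, 3.5, 1.4, 0.2}, {4.9, 3.0, 1.4, 0.2}, {7.0, 3.2, 4.7, 1.4},
            {6.4, 3.2, 4.5, 1.5}, {6.3, 3.3, 6.0, 2.5}, {5.8, 2.7, 5.1, 1.9}
        };
        for (double[] row : project(sample, 2)) System.out.println(Arrays.toString(row));
    }
}
```

Each output row holds the two new coordinates of one sample. The two output columns have zero mean and are uncorrelated with each other, which is the property that makes them suitable axes for a two-dimensional plot.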
Here is the code that uses the Java class <javadoc sc>jsat.datatransform.PCA</javadoc> to perform this transformation:<br />
The output image is shown here: <br />
