DMelt:Statistics/6 Dimensionality reduction, revision imported by Jworkorg at 17:18, 14 February 2021
<div>{{sidebar box|[[DMelt:Start|Table of contents]]}}<br />
<br />
<br />
= Dimensionality reduction = <br />
<br />
[[Dimensionality reduction]] transforms high-dimensional data into a representation <br />
with fewer dimensions while preserving as much of the original information as possible.<br />
<br />
Let us consider the [https://datamelt.org/examples/data/iris_org.arff Iris dataset] <ref>Fisher, R.A. "The use of multiple measurements in taxonomic problems", Annals of Eugenics, 7, Part II, 179-188 (1936); also in "Contributions to Mathematical Statistics" (John Wiley, NY, 1950).</ref>. <br />
The Iris dataset has 4 numerical attributes, so it cannot be plotted directly and is difficult for humans to visualize.<br />
To make it easier to inspect, one can reduce the dimensionality of this dataset to two. <br />
We will use [[Principal component analysis]] (PCA), which <br />
converts a set of observations of possibly correlated variables into a set of values of linearly uncorrelated variables called principal components.<br />
PCA requires the data samples to have zero mean, so we also need a transform that ensures this property.<br />
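<br />
The two steps described above (zero-mean centering, then projection onto the leading principal components) can be sketched in plain Python, independently of DMelt.<br />
This is only an illustrative sketch under stated assumptions: the function names <code>center</code>, <code>covariance</code>, <code>top_eigenvector</code> and <code>pca</code> are hypothetical and are not part of the jsat API, the eigenvectors are found with simple power iteration and deflation (which may differ from what jsat does internally), and the six rows are Iris-style measurements hard-coded for illustration.<br />

```python
import math
import random


def center(rows):
    """Subtract the per-column mean so that every attribute has zero mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of rows that are already centered."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_eigenvector(matrix, iterations=500, seed=0):
    """Power iteration: dominant unit eigenvector and eigenvalue of a
    symmetric positive semi-definite matrix (assumed to be non-zero)."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in matrix]
    for _ in range(iterations):
        w = [sum(a * b for a, b in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient; v has unit length, so no division is needed.
    eigenvalue = sum(v[i] * sum(a * b for a, b in zip(matrix[i], v))
                     for i in range(len(v)))
    return v, eigenvalue


def pca(rows, n_components=2):
    """Return (projected rows, principal directions) for the given data."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(n_components):
        v, lam = top_eigenvector(cov)
        components.append(v)
        # Deflation: remove the direction just found from the matrix.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)]
               for i in range(d)]
    projected = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
                 for row in centered]
    return projected, components


# Columns: sepal length, sepal width, petal length, petal width (cm).
rows = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.4, 3.2, 4.5, 1.5],
    [6.3, 3.3, 6.0, 2.5],
    [5.8, 2.7, 5.1, 1.9],
]

projected, components = pca(rows, n_components=2)
for point in projected:
    print("%8.4f %8.4f" % (point[0], point[1]))
```

Each printed line is one sample expressed in the two-dimensional principal-component coordinates, which is the kind of data a 2D scatter plot can display.<br />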
<br />
Here is the code that uses the Java package <javadoc sc>jsat.datatransform.PCA</javadoc> to perform this transformation:<br />
<br />
<br />
<jcode lang="python"><br />
dmelt 76974242.py<br />
</jcode><br />
<br />
The output image is shown here: <br />
<br />
<jput><br />
image 76974242<br />
</jput><br />
<br />
<br />
[[Category: Statistics]]<br />
[[Category:Data mining]]</div>