
Highlights

  • During dimensionality reduction, principal component analysis (PCA) tries to find a low-dimensional linear subspace that the given data conforms to. (View Highlight)
  • It’s pretty clear from the above visual that there is a linear subspace along which the data could be represented while retaining maximum data variance. This is shown below: (View Highlight)
  • The above curve is a continuous, non-linear, low-dimensional subspace along which we could represent our given data. (View Highlight)
  • The problem is that PCA cannot find this subspace, because the data points are not aligned along a straight line. (View Highlight)
  • Nonetheless, if we consider the above non-linear data, don’t you think there’s still some intuition telling us that this dataset can be reduced to one dimension if we can capture this non-linear curve? (View Highlight)
  • Project the data into another, higher-dimensional space using a kernel function, where the data becomes linearly representable. Sklearn provides a KernelPCA wrapper that supports many popular kernel functions. (View Highlight)
  • Apply the standard PCA algorithm to the transformed data. (A minimal code sketch of these two steps follows after the highlights.) (View Highlight)
  • As shown below, even though the data is non-linear, PCA still produces a linear subspace for projection, whereas KernelPCA produces a non-linear subspace. (View Highlight)
  • The catch is the run time. The run time of PCA is already cubic in the number of dimensions, and KernelPCA additionally involves the kernel trick, whose cost grows quadratically with the number of data points (n); see the short sketch at the end of the highlights. (View Highlight)
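
The following is a minimal sketch of the two-step procedure highlighted above (PCA in a kernel-induced feature space), not code from the original article. The make_circles toy dataset, the RBF kernel, and the gamma value are illustrative assumptions.

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import PCA, KernelPCA

# Two concentric circles: non-linear structure that no straight line captures.
X, _ = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# Standard PCA: finds the best *linear* 1-D subspace for the data.
X_pca = PCA(n_components=1).fit_transform(X)

# KernelPCA: implicitly maps the data into a high-dimensional space via the
# RBF kernel, then applies PCA there, yielding a non-linear 1-D representation.
X_kpca = KernelPCA(n_components=1, kernel="rbf", gamma=10).fit_transform(X)

print(X_pca.shape, X_kpca.shape)  # (400, 1) (400, 1)
```

Plotting the data coloured by X_kpca should show the inner and outer circles largely separated along the single kernel component, which no linear projection of the raw 2-D points can achieve.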
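
The quadratic dependence on n mentioned in the last highlight can be made concrete with a rough sketch (my own illustration, not the article's): the kernel trick evaluates the kernel for every pair of points, materialising an n × n Gram matrix before any eigendecomposition. The choice of rbf_kernel, n, and gamma here is arbitrary.

```python
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

n = 2000
X = np.random.rand(n, 2)

# Every pair of points gets a kernel evaluation, so the Gram matrix
# has n * n entries to compute and store.
K = rbf_kernel(X, gamma=10)
print(K.shape)  # (2000, 2000)
```

Doubling n quadruples the size of K (and its memory footprint), which is why KernelPCA becomes expensive on large datasets even before the eigendecomposition of K is considered.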