expertis | Blog

Perform Principal Component Analysis?

Principal Component Analysis (PCA) is a useful technique for dimensionality reduction. The analysis can show which variables are relevant and help us interpret what they mean.
PCA captures most of the variance in a dataset by combining correlated variables into a small number of components. For each component we try to determine whether it has a sensible interpretation, and then either create new parameters from the components or select some of the original variables. The first steps in PCA should include two significance tests: a test for homogeneity of variances and a correlation test.
Since we are trying to extract significant variables (to predict, classify or describe a phenomenon) from a set of parameters that may follow different distributions, we start with the first test. To that end we use Bartlett's test for homogeneity of variances. Its null hypothesis is that all the variables have the same variance.
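To make the idea of "capturing variance" concrete, here is a minimal sketch using prcomp from the stats library. The built-in iris dataset and the names numericData, pca and varExplained are our own assumptions for illustration; they are not the data behind this post.

```r
# Assumption: iris stands in for your own data; keep numeric columns only
numericData <- iris[, 1:4]

# Center and scale so no variable dominates just because of its units
pca <- prcomp(numericData, center = TRUE, scale. = TRUE)

# Share of the total variance captured by each component
varExplained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(varExplained, 3))

# Loadings: how much each original variable contributes to each component
print(round(pca$rotation, 3))
```

Reading the loadings is where we try to decide whether a component "makes sense": variables with large loadings on the same component move together.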

To perform Bartlett's test in R we use the function bartlett.test(numericData) from the stats library. Note that this function tests homogeneity of variances: its null hypothesis is that all the variables have the same variance, so a low p-value means the variances are significantly different, not that they are alike. The low p-value that counts as a positive result for PCA belongs to Bartlett's test of sphericity, whose null hypothesis is that the variables are uncorrelated; rejecting it means there is enough shared variance for the components to capture.
In the example shown to the right, we have our green light to perform Principal Component Analysis. What is the p-value cutoff? As usual, it depends on the significance level (alpha) you choose. In this example the p-value would make the cut even at alpha = 0.01 (99% confidence).
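As a rough sketch of the call itself, the snippet below runs bartlett.test on the numeric columns of the built-in iris dataset (our assumption, not the data used in this post). A data frame is a list of columns, so each column is treated as one sample. The names result, alpha and rejectEqualVariances are only illustrative.

```r
# Assumption: any all-numeric data frame can be used here
numericData <- iris[, 1:4]

# Null hypothesis: every column has the same variance
result <- bartlett.test(numericData)
print(result)

# Compare the p-value against the significance level you chose
alpha <- 0.01
rejectEqualVariances <- result$p.value < alpha
print(rejectEqualVariances)
```

If rejectEqualVariances is TRUE, the variances differ at that significance level, which is a good reason to scale the variables before running PCA.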
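Bartlett's test of sphericity is a separate test from bartlett.test: it works on the correlation matrix and asks whether the variables are correlated at all. Base R has no function for it that we know of, so here is a minimal sketch that computes the standard chi-square statistic by hand. The iris data and the variable names are assumptions for illustration; the psych package offers a ready-made version (cortest.bartlett) if you prefer not to write it yourself.

```r
# Assumption: iris stands in for your own all-numeric data
numericData <- iris[, 1:4]
n <- nrow(numericData)   # number of observations
p <- ncol(numericData)   # number of variables
R <- cor(numericData)    # correlation matrix

# Null hypothesis: R is an identity matrix (variables are uncorrelated)
chi2 <- -(n - 1 - (2 * p + 5) / 6) * log(det(R))
df <- p * (p - 1) / 2
pValue <- pchisq(chi2, df, lower.tail = FALSE)

print(c(chi2 = chi2, df = df, p.value = pValue))
```

Here a low p-value is the good news: the variables share enough variance for PCA to have something to summarize.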

(A brief pause to say that our intention is to write these posts in a colloquial, practical way rather than a technical one. We treat a methodology such as PCA as a tool, just as R is a tool for performing PCA. We are trying to show how one might approach an analysis, and how a tool is or is not helpful given the output it provides.)

November 28, 2015, by jsotelo
PCA – Testing for significance, Principal Component Analysis PCA

Julio Sotelo

julio.sotelo@expertis.co