expertis | Blog

Principal Component Analysis (PCA)

We often need to analyze data sets that have a large number of variables, which makes it harder to extract information from the data. When we use multivariate regression to describe or predict, we may not know which variables to include, and selection procedures such as backward or forward selection can be costly. They can also bring another problem: multicollinearity. Those procedures do not take into account the correlation among the selected x’s, which makes multiple regression difficult. This is where PCA comes in handy.
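To make the multicollinearity problem concrete, here is a minimal sketch that measures it with variance inflation factors (VIF). The data are made up for illustration, and the names `x1`, `x2` and `x3` are hypothetical. It relies on the fact that the VIF of each variable is the corresponding diagonal entry of the inverse correlation matrix.

```python
import numpy as np

# Hypothetical data: x2 is almost a copy of x1, x3 is unrelated to both.
rng = np.random.default_rng(1)
x1 = rng.normal(size=500)
x2 = x1 + 0.05 * rng.normal(size=500)
x3 = rng.normal(size=500)
X = np.column_stack([x1, x2, x3])

# Correlation matrix of the predictors (variables are columns).
corr = np.corrcoef(X, rowvar=False)

# VIF_j equals the j-th diagonal entry of the inverse correlation matrix.
vif = np.diag(np.linalg.inv(corr))

print(np.round(corr, 2))
print(np.round(vif, 1))
```

A large VIF for `x1` and `x2` signals that a regression would struggle to separate their effects, while `x3` stays close to 1. A common rule of thumb treats values above 10 as a warning sign.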

Principal Component Analysis helps identify commonalities among the variables and group them into components, which we then interpret, hoping they make practical sense. By doing this we may drop variables or replace them with surrogates, and not just for the statistical benefit of our regression model: we can actually choose which variables to use and decide what to do with those left aside.

PCA is a mathematical approach rather than a statistical one. Using the directions (eigenvectors) and the spread along each direction (eigenvalues), we re-express the variables so that just a few components gather as much of the variance as possible. At least that's what we hope for. So we know that even when we transform the data, their meaning and the relations among them remain.

PCA is an interesting and powerful tool that can be used at different steps of data mining: for dimension reduction, as it helps us perform feature extraction, and for pattern discovery, as we may use it to describe a phenomenon.
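The eigenvector and eigenvalue view described above can be sketched in a few lines of NumPy. This is only an illustration under assumptions: the function name `pca_eig` and the random data are hypothetical, and the variables are centered but not standardized.

```python
import numpy as np

def pca_eig(X):
    """PCA via eigendecomposition of the covariance matrix.

    X is an (n_samples, n_variables) array. Returns the eigenvalues in
    descending order, the matching eigenvectors as columns, and the
    data projected onto those directions (component scores).
    """
    Xc = X - X.mean(axis=0)                 # center each variable
    cov = np.cov(Xc, rowvar=False)          # variables are columns
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]       # reorder to descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                   # coordinates in the new directions
    return eigvals, eigvecs, scores

# Hypothetical data: three variables, the first two strongly correlated.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
X = np.column_stack([a, a + 0.1 * rng.normal(size=200), rng.normal(size=200)])

eigvals, eigvecs, scores = pca_eig(X)
explained = eigvals / eigvals.sum()         # share of total variance per component
print(np.round(explained, 3))
```

The variance of each column of `scores` equals the corresponding eigenvalue, which is why the eigenvalues tell us how much of the total variance each component gathers. When the variables are measured on very different scales, running the same steps on the correlation matrix (standardized variables) is a common alternative.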

Most important & difficult task – Explaining it

We recently used PCA to describe the interrelation between a client and its customers, and we found that the most challenging part of this analysis is making your client able to understand it. So try to tell a coherent story. That is not an easy task when you have to explain how merging variables helps identify patterns in a relationship. Using a single variable as a surrogate may be the best approach: you might lose predictive power, but explaining the phenomenon gets easier. If you trigger a eureka moment in your customer, adoption could come sooner.
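One way to pick a surrogate, sketched here as an assumption rather than as the exact procedure we followed, is to take for each retained component the variable with the largest absolute loading. The function `surrogate_variables`, the variable names and the data below are all hypothetical.

```python
import numpy as np

def surrogate_variables(X, names, n_components):
    """For each of the first n_components, return the name of the
    variable with the largest absolute loading on that component."""
    corr = np.corrcoef(X, rowvar=False)            # PCA on standardized variables
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    # Loadings: correlations between each variable and each component.
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    picks = [names[i] for i in np.abs(loadings).argmax(axis=0)]
    return picks, loadings

# Hypothetical data: two hidden factors drive two groups of variables.
rng = np.random.default_rng(2)
n = 300
f1, f2 = rng.normal(size=n), rng.normal(size=n)
names = ["orders", "revenue", "visits", "tickets", "complaints"]
X = np.column_stack([
    f1 + 0.3 * rng.normal(size=n),   # orders
    f1 + 0.3 * rng.normal(size=n),   # revenue
    f1 + 0.3 * rng.normal(size=n),   # visits
    f2 + 0.3 * rng.normal(size=n),   # tickets
    f2 + 0.3 * rng.normal(size=n),   # complaints
])

picks, loadings = surrogate_variables(X, names, n_components=2)
print(picks)
print(np.round(loadings, 2))
```

A single named variable per component is easier to explain to a client than a weighted mix of five, at the cost of discarding the information carried by the other variables in the group.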

November 27, 2015, by jsotelo

Julio Sotelo

julio.sotelo@expertis.co