Multicollinearity

In my last post I had trouble getting the intercept to be significant, so for this post I chose a different set of significant predictors to use for testing multicollinearity: ratings, number of reviews, and exclusivity.

The model equation would look like this:

Loves = -775.38 + 1451.61*Ratings + 35.55*Number of Reviews + 4654.49*Exclusivity
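To make the equation concrete, here is a small sketch of it as code (in Python rather than the R used for the model itself; the input values below are hypothetical examples, not rows from my data set):

```python
# Coefficients from the fitted model equation above.
intercept = -775.38
b_ratings = 1451.61
b_reviews = 35.55
b_exclusive = 4654.49

def predict_loves(ratings, num_reviews, exclusive):
    """Predicted Loves count; exclusive is a 0/1 indicator."""
    return (intercept
            + b_ratings * ratings
            + b_reviews * num_reviews
            + b_exclusive * exclusive)

# Hypothetical product: 4.5-star rating, 200 reviews, exclusive.
print(predict_loves(4.5, 200, 1))
```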

All three of these predictors tested as significant (three stars). Now it is time to test for multicollinearity, which is correlation among the predictors themselves. In my data set that means checking the correlations between ratings, number of reviews, and exclusivity, which I will do by building a correlation matrix with the “cor” function:
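The original code used R's cor function; as a stand-in, here is the same idea sketched in Python with pandas (the column names and data below are hypothetical, just to show the shape of the computation):

```python
import pandas as pd

# Hypothetical stand-in data; the real post computes this on the
# product data set in R with cor().
df = pd.DataFrame({
    "ratings":     [4.5, 3.8, 4.9, 4.1, 3.5, 4.7],
    "num_reviews": [120, 45, 300, 80, 20, 150],
    "exclusive":   [1, 0, 1, 0, 0, 1],  # binary indicator
})

# Pairwise Pearson correlations between the predictors; the diagonal
# is 1 because each variable correlates perfectly with itself.
corr_matrix = df.corr()
print(corr_matrix.round(3))
```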

The code above produces the following correlation matrix:

                Ratings   Reviews   Exclusivity
Ratings           1.000     0.081        -0.002
Reviews           0.081     1.000         0.004
Exclusivity      -0.002     0.004         1.000

As you can see, the diagonal of the matrix is all 1’s, which simply means that each variable is perfectly correlated with itself.

The numbers we actually care about are the three pairwise correlations between the significant predictors:

  • Number of reviews and ratings have a correlation of 0.081 (not big enough to raise any concern)
  • Ratings and exclusivity have a correlation of -0.002 (exclusivity is a binary variable, so its correlations do not behave like those between continuous variables; we can set this one aside)
  • Exclusivity and number of reviews have a correlation of 0.004 (again, since exclusivity is binary, we can set this correlation aside as well)

Since none of these pairwise correlations is large enough to raise concern, we can say that there is “no multicollinearity” in my model. Thank you for following along!
