Interpreting the Coefficients

In my last post, I said I would interpret the coefficients of my linear regression model. As a refresher, below is the summary output from R:

Since both the intercept and the slope are statistically significant, my model equation looks like this:

Loves = 16570.79 – 165.99*Difference in Price

We can start by interpreting the intercept. When the difference between value price and actual price is 0, the product has 16,571 loves on average. In other words, products that are not discounted have 16,571 loves on average. This could be considered a typical number of loves, because many products at Sephora are not discounted and have a high number of loves. We can check this with a histogram.

The histogram shows that many of the products at Sephora do not have 16,571 loves. In fact, it shows that 16,571 loves is on the lower end.

Let’s do the same check on Difference in Price…

This histogram supports my earlier statement that a lot of products at Sephora are not discounted.

For our slope, -165.99 indicates that for every one-dollar increase in the difference in price, the number of loves decreases by 166 on average. This is a fairly large decrease. The slope essentially means that products with a bigger discount tend to have fewer loves. Keep in mind this is an association, not proof that discounting causes fewer loves. It is still economically significant, because it may mean Sephora should pick and choose which products to offer a lower price on, perhaps based on ratings or something like that.
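To make the intercept and slope concrete, here is a small Python sketch (not the R code behind this post) that plugs a few price differences into the model equation above. The function name and the example price differences are my own illustration, not values taken from the dataset.

```python
# Coefficients copied from the model equation in this post.
INTERCEPT = 16570.79
SLOPE = -165.99


def predicted_loves(price_difference):
    """Average loves the fitted line predicts for a price difference in dollars."""
    return INTERCEPT + SLOPE * price_difference


# Illustrative price differences (assumed for the example, not rows from the data).
for diff in (0, 1, 10, 50):
    print(f"difference = ${diff}: about {predicted_loves(diff):,.0f} loves")
```

At a difference of 0 the prediction is just the intercept, and each extra dollar of difference lowers the prediction by the slope, so a $10 difference gives roughly 14,911 loves.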

Now let’s take a look at a different variable.

Now we are going to investigate the number of loves (Y) against exclusivity (X). Exclusivity indicates whether Sephora is the only company that sells the product. It is coded as 0 = NO and 1 = YES.

Below is the code I used to find out statistical significance:

Below is the output summary in R:

Here we can see that both the intercept and the slope are statistically significant, as shown by the three stars next to each p-value.

The model equation for these two variables is Loves = 14975.9 + 4920.8*Exclusivity

The intercept of this equation, 14,975.9, means that when exclusivity is 0 (i.e., the product is not exclusive to Sephora) the product has 14,976 loves ON AVERAGE. The slope, 4,920.8, means that a product that is exclusive (exclusivity = 1) has 4,921 more loves ON AVERAGE than one that is not. Since this is a binary variable, there are only two options, 0 or 1. Essentially, this equation gives you the average number of loves for exclusive products and for non-exclusive products. If you plug in anything between 0 and 1, anything less than 0, or anything greater than 1, the result will mean nothing, because a product is either exclusive or it is not. A product cannot be half exclusive or 1.5 exclusive.
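The idea that a regression on a 0/1 variable just gives back the two group averages can be checked with a short Python sketch. The toy numbers below are made up for illustration and are NOT the Sephora data, and `fit_simple_ols` is a hypothetical helper using the textbook least-squares formulas, not the R function used for this post.

```python
from statistics import mean


def fit_simple_ols(x, y):
    """Least-squares intercept and slope for a single predictor."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Made-up toy data (NOT the Sephora dataset): 0 = not exclusive, 1 = exclusive.
exclusive = [0, 0, 0, 1, 1]
loves = [100, 200, 300, 500, 700]

intercept, slope = fit_simple_ols(exclusive, loves)
mean_not_exclusive = mean(l for e, l in zip(exclusive, loves) if e == 0)
mean_exclusive = mean(l for e, l in zip(exclusive, loves) if e == 1)

# The intercept matches the average of the non-exclusive group,
# and intercept + slope matches the average of the exclusive group.
print(intercept, mean_not_exclusive)
print(intercept + slope, mean_exclusive)

# Same arithmetic with the coefficients reported in this post.
print(14975.9 + 4920.8 * 0)  # average loves, not exclusive
print(14975.9 + 4920.8 * 1)  # average loves, exclusive
```

With the coefficients from this post, the two averages come out to about 14,976 loves for non-exclusive products and about 19,897 loves for exclusive ones.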

Thank you for following along on this journey to using linear regression with my Sephora website dataset!!!
