Figure 1 below provides an example of an influential outlier. Influential outliers are points in a data set that strongly affect the correlation coefficient; in this example, the outlier increases it. In Figure 1 the correlation between \(x\) and \(y\) is strong (\(r=0.979\)). In Figure 2 below, the outlier is removed. Now the correlation between \(x\) and \(y\) is lower (\(r=0.576\)) and the slope is less steep. Note that the scale on both the \(x\) and \(y\) axes has changed. In addition to the correlation changing, the y-intercept changed from 4.154 to 70.84 and the slope changed from 6.661 to 1.632.

3.4.2.1 - Formulas for Computing Pearson's r

There are a number of different versions of the formula for computing Pearson's \(r\). You should get the same correlation value regardless of which formula you use. Note that you will not have to compute Pearson's \(r\) by hand in this course. These formulas are presented here to help you understand what the value means. You should always use technology to compute this value.

First, we'll look at the conceptual formula, which uses \(z\) scores. To use this formula we would first compute the \(z\) score for every \(x\) and \(y\) value. We would then multiply each case's \(z_x\) by its \(z_y\). If a case's \(x\) and \(y\) values were both above the mean, this product would be positive. If both values were below the mean, the product would also be positive, because the product of two negative \(z\) scores is positive. If one value was above the mean and the other was below the mean, the product would be negative.
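The conceptual \(z\)-score approach described in this section can be sketched in a few lines of Python. This is only an illustration, not the course's required method. It assumes the standard definition in which the \(z_x z_y\) products are summed and divided by \(n-1\), with \(z\) scores based on the sample standard deviation. The function name `pearson_r_from_z` and the small data set are hypothetical.

```python
from statistics import mean, stdev


def pearson_r_from_z(xs, ys):
    """Pearson's r via z scores: sum of (z_x * z_y) divided by (n - 1).

    Assumes the sample standard deviation (n - 1 denominator) is used
    to compute the z scores.
    """
    if len(xs) != len(ys):
        raise ValueError("x and y must have the same number of cases")
    n = len(xs)
    if n < 2:
        raise ValueError("at least two cases are required")

    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when a variable has no variation")

    # z score for every x and y value
    z_x = [(x - x_bar) / s_x for x in xs]
    z_y = [(y - y_bar) / s_y for y in ys]

    # Multiply each case's z_x by its z_y. The product is positive when
    # both values fall on the same side of their means and negative
    # when they fall on opposite sides.
    products = [zx * zy for zx, zy in zip(z_x, z_y)]

    return sum(products) / (n - 1)


# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r_from_z(x, y)
print(round(r, 3))  # 0.775
```

Because most cases in this small data set sit on the same side of both means, most of the products are positive or zero, and the resulting \(r\) is positive.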