Chapter 2. Correlation: Correlation
Direction of a Linear Relationship: Covariance
To determine the direction of the linear relationship between two variables, calculate the covariance.
Definition
The covariance measures the direction of the linear relationship between two quantitative variables.
The sample covariance between two variables $X$ and $Y$ is denoted $s_{XY}$.
A positive covariance indicates the variables have a positive linear relationship. A negative covariance indicates the variables have a negative linear relationship.
Formulas
$s_{XY} = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
Computation of the Sample Covariance in Excel and R
To compute the sample covariance between two variables $X$ and $Y$ in Excel, use the following function:
COVARIANCE.S(array1, array2)
- array1: The range of cells containing the values of variable $X$.
- array2: The range of cells containing the values of variable $Y$.
To compute the sample covariance between two variables $X$ and $Y$ in R, use the following function:
cov(x, y)
- x: The numeric vector that contains the values of variable $X$.
- y: The numeric vector that contains the values of variable $Y$.
To calculate the covariance between two variables $X$ and $Y$, multiply the deviation score with respect to $X$, $(X_i - \bar{X})$, by the deviation score with respect to $Y$, $(Y_i - \bar{Y})$, for each case in the dataset.
If both $X_i$ and $Y_i$ lie on the same side of their respective means, the resulting product is positive. Specifically:
- If both scores ($X_i$, $Y_i$) lie below their respective means, then both deviation scores are negative, but their product is positive.
- If both scores ($X_i$, $Y_i$) lie above their respective means, then both deviation scores are positive, and so is their product.
If the scores lie on opposite sides of their respective means, then one deviation score is negative and the other is positive, so the resulting product is negative.
These products are then summed and divided by $n - 1$, which is approximately their average. The resulting measure is called the sample covariance.
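The steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `sample_covariance` and the data are hypothetical, and the sum of products is divided by n - 1, which is the convention used by R's `cov` and Excel's `COVARIANCE.S`.

```python
def sample_covariance(x, y):
    """Sample covariance: sum of deviation-score products divided by n - 1."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n < 2:
        raise ValueError("at least two cases are needed")

    mean_x = sum(x) / n
    mean_y = sum(y) / n

    # Deviation score of each case with respect to its own variable's mean.
    dev_x = [xi - mean_x for xi in x]
    dev_y = [yi - mean_y for yi in y]

    # Product of the two deviation scores, case by case.
    products = [dx * dy for dx, dy in zip(dev_x, dev_y)]

    return sum(products) / (n - 1)


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

print(sample_covariance(x, y))  # prints 1.5
```

Here the means are 3 and 4, the deviation products are 4, 0, 0, 0 and 2, and their sum of 6 divided by n - 1 = 4 gives 1.5.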
Interpreting the sign of the covariance
The sign of the covariance indicates the direction of the linear relationship:
- If $s_{XY} > 0$, then $X$ and $Y$ are said to have a positive linear relationship.
- If $s_{XY} < 0$, then $X$ and $Y$ are said to have a negative linear relationship.
- If $s_{XY} = 0$, then $X$ and $Y$ are said to be linearly unrelated.
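The three cases can be illustrated with a small Python sketch. The helper names `sample_covariance` and `direction` and all data are hypothetical, chosen so that each case occurs exactly. The third dataset follows y = (x - 3)^2: its covariance is zero even though the variables are clearly related, which shows that "linearly unrelated" refers only to the absence of a linear relationship.

```python
def sample_covariance(x, y):
    """Sample covariance with the n - 1 denominator."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


def direction(cov):
    """Map the sign of a covariance to the direction of the linear relationship."""
    # With real-valued measurements an exact zero is rare; a tolerance
    # would be needed in practice. The toy data below give exact values.
    if cov > 0:
        return "positive"
    if cov < 0:
        return "negative"
    return "linearly unrelated"


x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]      # increases as x increases
falling = [10, 8, 6, 4, 2]     # decreases as x increases
symmetric = [4, 1, 0, 1, 4]    # y = (x - 3)^2, a purely nonlinear pattern

cov_rising = sample_covariance(x, rising)
cov_falling = sample_covariance(x, falling)
cov_symmetric = sample_covariance(x, symmetric)

print(cov_rising, direction(cov_rising))        # 5.0 positive
print(cov_falling, direction(cov_falling))      # -5.0 negative
print(cov_symmetric, direction(cov_symmetric))  # 0.0 linearly unrelated
```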
Interpreting the magnitude of the covariance
Although the sign of the covariance is a good measure of the direction of the linear relationship between two variables, the magnitude of the covariance is not a good measure of the strength of the relationship. This is because the magnitude of the covariance depends heavily on the scale on which the variables are measured.
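The scale dependence follows directly from the definition of the sample covariance. The short derivation below is a sketch under assumed notation: $s_{XY}$ denotes the sample covariance, and $a$ and $b$ are arbitrary positive constants (not part of the original text) by which $X$ and $Y$ are rescaled.

```latex
\begin{align*}
X_i' &= aX_i, \qquad Y_i' = bY_i
  \quad\Longrightarrow\quad \bar{X}' = a\bar{X}, \qquad \bar{Y}' = b\bar{Y} \\[4pt]
s_{X'Y'} &= \frac{\sum_{i=1}^{n} (X_i' - \bar{X}')(Y_i' - \bar{Y}')}{n - 1}
  = \frac{\sum_{i=1}^{n} a(X_i - \bar{X}) \, b(Y_i - \bar{Y})}{n - 1}
  = ab \, s_{XY}
\end{align*}
```

Rescaling both variables by the same factor $a$ therefore multiplies the covariance by $a^2$, even though the relationship between the variables is unchanged.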
Suppose we have a dataset containing the measurements of two variables $X$ and $Y$, both originally measured in meters. We calculate the covariance between these two variables and find some value $s_{XY}$.
Now suppose we decide to express the measurements of $X$ and $Y$ in centimeters instead. To do so, we multiply all the values in the dataset by $100$. We then recalculate the covariance and find a value of $100 \cdot 100 \cdot s_{XY} = 10{,}000 \cdot s_{XY}$.
By multiplying each value in the dataset by a factor of $100$, the covariance increased by a factor of $10{,}000$. This illustrates why the covariance is a poor measure of the strength of the relationship between two variables. Multiplying or dividing all values in our dataset by some constant should not affect our measurement of the strength of the relationship between the variables.
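The effect of the unit change can be checked numerically. The sketch below uses hypothetical measurements in meters (not the values from the example above) and a hypothetical helper `sample_covariance`. Converting both variables to centimeters multiplies every deviation score by 100, so each product, and hence the covariance, grows by a factor of 100 × 100.

```python
import math


def sample_covariance(x, y):
    """Sample covariance with the n - 1 denominator."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / (n - 1)


# Hypothetical measurements in meters.
x_m = [1, 2, 3, 4, 5]
y_m = [2, 4, 5, 4, 5]

# The same measurements expressed in centimeters.
x_cm = [100 * v for v in x_m]
y_cm = [100 * v for v in y_m]

cov_m = sample_covariance(x_m, y_m)
cov_cm = sample_covariance(x_cm, y_cm)

print(cov_m)           # covariance in square meters
print(cov_cm)          # covariance in square centimeters
print(cov_cm / cov_m)  # ratio of the two, expected to be 100 * 100

# The data are identical up to units, yet the covariance differs by 10,000.
assert math.isclose(cov_cm, 10_000 * cov_m)
```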
Consider the following pairs of data points:
Calculate the sample covariance between $X$ and $Y$.
First calculate the means of variables $X$ and $Y$:
Now that the means are known, the values of $(X_i - \bar{X})$, $(Y_i - \bar{Y})$ and $(X_i - \bar{X})(Y_i - \bar{Y})$ can be calculated:
With this information, the sample covariance can be calculated: