Random Variables
In this lesson we learn about discrete and continuous random variables, their associated probability distributions, and the computation of their means and variances.
Random Variable
Given an experiment with sample space $S$ consisting of all possible outcomes, a random variable associates a real number with each outcome in $S$.
Although we call a r.v. a “variable”, it is actually a function, whose domain is $S$ and whose range is the subset of the real numbers to which the outcomes in $S$ are assigned by the function.
Notation
We usually denote a random variable (r.v.) with an upper-case letter from the latter part of the Latin alphabet, such as $X$ or $Y$.
The range of a r.v. $X$ is denoted $R_X$.
Discrete Random Variable
If $R_X$ is either finite or countable, then we say that $X$ is a discrete random variable.
For example, suppose an experiment involves flipping a coin $3$ times. In this case, the sample space is:
$$S = \{HHH,\, HHT,\, HTH,\, HTT,\, THH,\, THT,\, TTH,\, TTT\}$$
Let the r.v. $X$ be defined as the number of Heads in the $3$ coin flips. Then the range of $X$ is:
$$R_X = \{0, 1, 2, 3\}$$
Since $R_X$ is finite, $X$ is a discrete r.v.
As you can see, each of the outcomes in $S$ is assigned by $X$ to one of the values in $R_X$.
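To make concrete the point that a r.v. is a function, here is a minimal Python sketch (the encoding of outcomes as strings like "HHT" is our own illustration, not part of the lesson) that builds this sample space and applies $X$ to each outcome:

```python
from itertools import product

# Sample space of three coin flips, each outcome encoded as a string like "HHT".
S = ["".join(flips) for flips in product("HT", repeat=3)]

# The r.v. X is a function: it assigns to each outcome its number of Heads.
X = {outcome: outcome.count("H") for outcome in S}

print(X)  # {'HHH': 3, 'HHT': 2, 'HTH': 2, ..., 'TTT': 0}
```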
As a second example, suppose an experiment involves counting the number of radioactive particles emitted by a specimen of uranium in one hour.
Then the sample space is:
$$S = \{0, 1, 2, 3, \ldots\}$$
Let the r.v. $X$ be defined as the number of radioactive particles emitted by the specimen in one hour. Since $S$ is already a set of real numbers, it is convenient to define the range of $X$ exactly the same way.
Thus the range of $X$ is
$$R_X = \{0, 1, 2, 3, \ldots\},$$
which is a countable set. So $X$ is a discrete r.v.
Continuous Random Variable
If the range $R_X$ of a r.v. $X$ is uncountable, we say that $X$ is a continuous random variable.
Suppose an experiment involves weighing (in grams) a specimen of rock taken from a geologic excavation site. Then the sample space of this experiment is:
$$S = (0, \infty)$$
Let the r.v. $X$ be defined as the weight, in grams, of the rock. Since $S$ is already a set of real numbers, it is convenient to define the range of $X$ exactly the same way.
Thus the range of $X$ is
$$R_X = (0, \infty),$$
which is an uncountable set. So $X$ is a continuous r.v.
Probability Distribution of a Discrete Random Variable
For a discrete r.v. $X$, we define its probability distribution as the allocation of probability to each value in $R_X$.
Let $x$ denote one of the possible values for $X$, i.e., a number in the set $R_X$.
Then the probability of the event $X = x$, written $P(X = x)$, is the sum of the probabilities of all outcomes in $S$ which are assigned the value $x$.
Probability Mass Function
The Probability Mass Function (pmf) of a discrete r.v. $X$ is a function that gives us $p_X(x) = P(X = x)$ for each $x$ in $R_X$.
This function could be given in a table if $R_X$ is not too large, or it could be given as a formula.
To be a legitimate pmf, it must be true that
$$p_X(x) \ge 0 \text{ for all } x \text{ in } R_X \quad\text{and}\quad \sum_{x \in R_X} p_X(x) = 1.$$
Suppose an experiment involves flipping a coin $3$ times. The sample space of this experiment is:
$$S = \{HHH,\, HHT,\, HTH,\, HTT,\, THH,\, THT,\, TTH,\, TTT\}$$
Let the r.v. $X$ be defined as the number of Heads in the $3$ coin flips. Thus the range of $X$ is:
$$R_X = \{0, 1, 2, 3\}$$
The pmf of $X$ can be given in a table as:

$x$ | $0$ | $1$ | $2$ | $3$
$p_X(x)$ | $1/8$ | $3/8$ | $3/8$ | $1/8$
The first row shows the values $x$ in $R_X$, and the second row shows the probability that $X = x$ for each $x$ in $R_X$, i.e., the probability mass function $p_X(x)$.
As you can easily see, the sum of the four probabilities is $\frac{1}{8} + \frac{3}{8} + \frac{3}{8} + \frac{1}{8} = 1$.
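If you want to verify this table yourself, the following sketch tallies the $8$ equally likely outcomes by their number of Heads; exact fractions are used so that the total comes out to exactly $1$ (the choice of Python's `fractions` module is just a convenience):

```python
from itertools import product
from fractions import Fraction

# Sample space of three coin flips; all 8 outcomes are equally likely.
S = ["".join(flips) for flips in product("HT", repeat=3)]

# Build the pmf: each outcome contributes probability 1/8 to its value of X.
pmf = {}
for outcome in S:
    x = outcome.count("H")
    pmf[x] = pmf.get(x, Fraction(0)) + Fraction(1, 8)

print(pmf)                # {3: Fraction(1, 8), 2: Fraction(3, 8), ...}
print(sum(pmf.values()))  # 1 -- a legitimate pmf must sum to 1
```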
As a second example, suppose an experiment involves counting the number of radioactive particles emitted by a specimen of uranium in one hour.
Let the discrete r.v. $X$ be defined as the number of radioactive particles emitted by the specimen in one hour. Thus the range of $X$ is:
$$R_X = \{0, 1, 2, 3, \ldots\}$$
The pmf of $X$ is given as a formula in this case:
$$p_X(x) = \frac{e^{-\lambda}\lambda^x}{x!}$$
for $x$ in $R_X$, where $\lambda > 0$ is a constant.
It is not so obvious that the sum of $p_X(x)$ over all values $x$ in $R_X$ equals $1$, but it is:
$$\sum_{x=0}^{\infty} \frac{e^{-\lambda}\lambda^x}{x!} = e^{-\lambda} \sum_{x=0}^{\infty} \frac{\lambda^x}{x!} = e^{-\lambda} e^{\lambda} = 1$$
Here we used the Maclaurin series expansion for $e^{\lambda}$ which you learned in calculus:
$$e^{\lambda} = \sum_{x=0}^{\infty} \frac{\lambda^x}{x!}$$
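A quick numerical check of this fact: the partial sums of the pmf approach $1$. The rate $\lambda = 2.5$ below is an arbitrary assumed value, since the lesson leaves $\lambda$ generic:

```python
import math

lam = 2.5  # an assumed value for the rate; any positive number works

# Partial sum of p(x) = e^(-lam) * lam^x / x! over x = 0, 1, ..., 99.
total = sum(math.exp(-lam) * lam**x / math.factorial(x) for x in range(100))
print(total)  # 1.0 up to floating-point rounding
```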
Sometimes we are concerned not with the probability that a r.v. equals a specific value but with the probability that a r.v. is within a range of values.
Cumulative Distribution Function
For this, we define the cumulative distribution function (cdf)
$$F_X(x) = P(X \le x)$$
for each $x$ in the real numbers.
Based on this definition, $F_X$ is a non-decreasing function, i.e., if $a < b$ then $F_X(a) \le F_X(b)$.
Since $F_X(x)$ is a probability, its value must be in the interval $[0, 1]$. Moreover, as $x \to -\infty$, $F_X(x) \to 0$, and as $x \to \infty$, $F_X(x) \to 1$.
Discrete Cumulative Distribution Function
For a discrete r.v. $X$, its cdf is
$$F_X(x) = \sum_{t \in R_X,\ t \le x} p_X(t).$$
Here we have the understanding that $p_X(t) = 0$ if $t$ is not in $R_X$. Consequently, the graph of $F_X$ will be a step function, with “jumps” at the real numbers which are in $R_X$.
For example, recall the experiment involving flipping a coin $3$ times, in which the r.v. $X$ was defined as the number of Heads in the $3$ coin flips.
The pmf of $X$ was given in a table as:

$x$ | $0$ | $1$ | $2$ | $3$
$p_X(x)$ | $1/8$ | $3/8$ | $3/8$ | $1/8$
From this table we can see that for every $x < 0$:
$$F_X(x) = 0$$
At $x = 0$ we accumulate a probability of $\frac{1}{8}$, so for $0 \le x < 1$:
$$F_X(x) = \frac{1}{8}$$
At $x = 1$ we accumulate an additional probability of $\frac{3}{8}$ to be added to the previous $\frac{1}{8}$, so for $1 \le x < 2$:
$$F_X(x) = \frac{4}{8} = \frac{1}{2}$$
At $x = 2$ we accumulate an additional probability of $\frac{3}{8}$ to be added to the previous $\frac{1}{2}$, so for $2 \le x < 3$:
$$F_X(x) = \frac{7}{8}$$
At $x = 3$ we accumulate an additional probability of $\frac{1}{8}$ to be added to the previous $\frac{7}{8}$, after which no further probability can be accumulated, so for $x \ge 3$:
$$F_X(x) = 1$$
We would write this as a piecewise-defined function:
$$F_X(x) = \begin{cases} 0 & x < 0 \\ \frac{1}{8} & 0 \le x < 1 \\ \frac{1}{2} & 1 \le x < 2 \\ \frac{7}{8} & 2 \le x < 3 \\ 1 & x \ge 3 \end{cases}$$
The graph of $F_X$ for this example is a staircase: as we go from left to right we climb up the steps from $0$ to $1$, with the jumps occurring at each $x$ in $R_X$.
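The accumulation idea translates directly into a short sketch: the function `F` below sums the pmf mass at all values $t \le x$, reproducing the steps computed above:

```python
from fractions import Fraction

# pmf of X = number of Heads in 3 coin flips
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def F(x):
    """Cdf of a discrete r.v.: accumulate pmf mass at all values t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)

for x in [-0.5, 0, 0.5, 1.7, 2, 3, 10]:
    print(x, F(x))  # climbs the steps 0, 1/8, 1/2, 7/8, 1
```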
Every random variable can be described with two parameters, its mean and its variance. We previously used these terms to describe the distribution of a quantitative variable on a population and on a sample. This is a different use of the terms, so you will need to keep the three uses sorted in your mind.
For a r.v., the mean signifies the center of the probability distribution, i.e., the value around which observations of the r.v. are centered.
The variance signifies how dispersed about the mean the observations of the r.v. tend to be, i.e., whether the values are generally close to the mean or generally spread out around the mean.
Mean of a Discrete Random Variable
For a discrete r.v. $X$ with pmf $p_X(x)$, the mean is denoted $\mu_X$ (or sometimes just $\mu$ if there is no danger of confusion), and is computed as:
$$\mu_X = \sum_{x \in R_X} x\, p_X(x)$$
The mean of $X$ is also known as the expected value of $X$ (or the expectation of $X$), in which case it is denoted $E(X)$.
Variance and Standard Deviation of a Discrete Random Variable
The variance of a discrete r.v. $X$ with pmf $p_X(x)$ is denoted $\sigma_X^2$ (or, equivalently, $V(X)$), and is computed as:
$$\sigma_X^2 = \sum_{x \in R_X} (x - \mu_X)^2\, p_X(x)$$
With a little effort one can obtain an alternative form of this formula which is usually much easier to use:
$$\sigma_X^2 = \left(\sum_{x \in R_X} x^2\, p_X(x)\right) - \mu_X^2$$
The standard deviation of a r.v. $X$ is simply the square root of the variance, and is denoted $\sigma_X$.
If there is no danger of confusion, the variance and standard deviation can be denoted $\sigma^2$ and $\sigma$, respectively.
Recall again the experiment involving flipping a coin $3$ times, in which the r.v. $X$ was defined as the number of Heads in the $3$ coin flips.
The pmf of $X$ was given in a table as:

$x$ | $0$ | $1$ | $2$ | $3$
$p_X(x)$ | $1/8$ | $3/8$ | $3/8$ | $1/8$
Hence the mean of $X$ is:
$$\mu_X = 0 \cdot \frac{1}{8} + 1 \cdot \frac{3}{8} + 2 \cdot \frac{3}{8} + 3 \cdot \frac{1}{8} = \frac{12}{8} = 1.5$$
And the variance of $X$ is:
$$\sigma_X^2 = \left(0^2 \cdot \frac{1}{8} + 1^2 \cdot \frac{3}{8} + 2^2 \cdot \frac{3}{8} + 3^2 \cdot \frac{1}{8}\right) - (1.5)^2 = 3 - 2.25 = 0.75$$
Thus the standard deviation of $X$ is:
$$\sigma_X = \sqrt{0.75} \approx 0.866$$
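These three computations are easy to verify in a few lines; once more the exact-fraction representation is only a convenience:

```python
from fractions import Fraction

pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

mu = sum(x * p for x, p in pmf.items())      # mean: sum of x * p(x)
ex2 = sum(x**2 * p for x, p in pmf.items())  # E(X^2)
var = ex2 - mu**2                            # shortcut formula for the variance
sd = float(var) ** 0.5

print(mu, var, sd)  # 3/2 3/4 0.8660254037844386
```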
Probability Density Function
If a random variable $X$ is continuous, its probability distribution is defined for all real numbers $x$ by a function $f_X(x)$ called its probability density function (pdf).
To be a legitimate pdf, we must have
$$f_X(x) \ge 0 \text{ for all } x, \quad\text{and}\quad \int_{-\infty}^{\infty} f_X(x)\, dx = 1$$
Since the range of $X$ is in this case uncountable, it does not make sense to talk about $P(X = x)$ for an individual value $x$ (this probability is $0$). We can only compute probabilities for the r.v. to fall into an interval.
This is done by computing the area under the graph of its pdf above the interval, which usually involves integration. That is, for any two real numbers $a$ and $b$ with $a < b$:
$$P(a \le X \le b) = \int_a^b f_X(x)\, dx, \qquad P(X \le b) = \int_{-\infty}^{b} f_X(x)\, dx, \qquad P(X \ge a) = \int_a^{\infty} f_X(x)\, dx$$
Important: In all three formulas above, the signs $\le$ and $\ge$ can be replaced by $<$ and $>$, since the area under the graph of $f_X$ is not altered by exclusion of the endpoints. However, this is not true for discrete r.v.s! Be sure to keep this distinction clear in your mind.
Cumulative Distribution Function
The cumulative distribution function (cdf) of a continuous r.v. $X$ is still defined as $F_X(x) = P(X \le x)$, but it is now computed using integration.
That is, for every real number $x$:
$$F_X(x) = \int_{-\infty}^{x} f_X(t)\, dt$$
For example, the time $X$ (in seconds) until the next emission of a radioactive particle from a specimen, after the previous emission, can be modeled as a continuous r.v. with pdf $f_X(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $f_X(x) = 0$ for $x < 0$, where $\lambda > 0$ is a constant.
The graph of the pdf starts at height $\lambda$ at $x = 0$ and decays exponentially toward $0$ as $x$ increases.
Clearly it is true that $f_X(x) \ge 0$ for all $x$.
Also,
$$\int_{-\infty}^{\infty} f_X(x)\, dx = \int_0^{\infty} \lambda e^{-\lambda x}\, dx = \left[-e^{-\lambda x}\right]_0^{\infty} = 0 - (-1) = 1,$$
so $f_X$ is a legitimate pdf.
What is the probability that the time until the next emission is more than $s$ seconds? Integrating the pdf over the interval $(s, \infty)$ gives:
$$P(X > s) = \int_s^{\infty} \lambda e^{-\lambda x}\, dx = e^{-\lambda s}$$
The cdf of $X$ is $F_X(x) = 0$ if $x < 0$, and $F_X(x) = \int_0^x \lambda e^{-\lambda t}\, dt = 1 - e^{-\lambda x}$ if $x \ge 0$.
We write this cdf as a piecewise-defined function:
$$F_X(x) = \begin{cases} 0 & x < 0 \\ 1 - e^{-\lambda x} & x \ge 0 \end{cases}$$
The graph of the cdf is $0$ for $x < 0$; as you can see from the formula, it then rises as a smooth curve on $[0, \infty)$ and approaches $1$ as $x \to \infty$.
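To see how the pdf, the cdf, and interval probabilities fit together numerically, the sketch below computes $P(X > s)$ both by integrating the pdf and from the closed-form cdf. The values $\lambda = 0.5$ and $s = 3$ are assumptions made for the demonstration, and the `scipy` library is assumed to be available:

```python
import math
from scipy.integrate import quad

lam = 0.5  # assumed emission rate (per second); the text leaves lambda generic

def f(x):
    """pdf: lam * e^(-lam x) for x >= 0, and 0 for x < 0."""
    return lam * math.exp(-lam * x) if x >= 0 else 0.0

def F(x):
    """Closed-form cdf: 1 - e^(-lam x) for x >= 0, and 0 for x < 0."""
    return 1 - math.exp(-lam * x) if x >= 0 else 0.0

s = 3.0
tail, _ = quad(f, s, math.inf)  # P(X > s) by numerical integration
print(tail, 1 - F(s))           # both equal e^(-lam*s) = 0.22313...
```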
Mean and Variance of a Continuous Random Variable
For a continuous r.v. $X$ with pdf $f_X(x)$, the mean and variance of $X$ are computed as:
$$\mu_X = \int_{-\infty}^{\infty} x\, f_X(x)\, dx$$
and
$$\sigma_X^2 = \int_{-\infty}^{\infty} (x - \mu_X)^2\, f_X(x)\, dx$$
With some algebraic manipulation, the formula for the variance becomes:
$$\sigma_X^2 = \left(\int_{-\infty}^{\infty} x^2\, f_X(x)\, dx\right) - \mu_X^2,$$
which is generally much easier to compute.
As with discrete r.v.s, the mean of $X$ is also called the expectation of $X$ or the expected value of $X$, denoted $E(X)$, and the variance of $X$ can also be denoted $V(X)$.
The standard deviation of $X$ is the square root of the variance, denoted $\sigma_X$.
Again, the subscripts can be omitted if there is no danger of confusion.
For example, recall that the time $X$ (in seconds) until the next emission of a radioactive particle from a specimen, after the previous emission, can be modeled as a continuous r.v. with pdf:
$$f_X(x) = \lambda e^{-\lambda x} \text{ for } x \ge 0,$$
and
$$f_X(x) = 0 \text{ for } x < 0.$$
What is the average time until the next emission? That would be the mean of $X$, which is:
$$\mu_X = \int_{-\infty}^{\infty} x\, f_X(x)\, dx = \int_0^{\infty} x\, \lambda e^{-\lambda x}\, dx$$
This requires the technique of integration by parts, which results in $\mu_X = \frac{1}{\lambda}$. Note that the lower limit of the integral is $0$ rather than $-\infty$, since $f_X(x) = 0$ for $x < 0$.
The variance of $X$ is:
$$\sigma_X^2 = \left(\int_0^{\infty} x^2\, \lambda e^{-\lambda x}\, dx\right) - \mu_X^2,$$
which requires two iterations of integration by parts. After a bit of effort, you get the result:
$$\sigma_X^2 = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$$
Then the standard deviation of $X$ is $\sigma_X = \frac{1}{\lambda}$, which is the same as the mean for this example.
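If you would rather not carry out the integration by parts by hand, a computer algebra system can confirm both results symbolically. This sketch assumes the `sympy` library is available:

```python
import sympy as sp

x = sp.symbols("x")
lam = sp.symbols("lam", positive=True)

f = lam * sp.exp(-lam * x)  # pdf on [0, oo); zero elsewhere

mu = sp.integrate(x * f, (x, 0, sp.oo))              # mean
var = sp.integrate(x**2 * f, (x, 0, sp.oo)) - mu**2  # shortcut formula

print(mu, sp.simplify(var))  # 1/lam lam**(-2)
```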
As with distributions of a variable on a population or on a sample, we can also refer to the quantiles of the probability distribution of a random variable. However, most of the quantiles will only make sense for a continuous r.v.
Quantiles of a Continuous Probability Distribution
For any number $p$ in $(0, 1)$, the $p$th quantile of a continuous random variable $X$ with cdf $F_X$ is the number $\eta(p)$ that satisfies $F_X(\eta(p)) = p$.
If $p = k/100$ for an integer $k$, then $\eta(p)$ can also be called the $k$th percentile.
The median of $X$ is the $0.5$ quantile of $X$, i.e., the $50$th percentile of $X$.
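For the exponential model above, the defining equation can be solved in closed form: $1 - e^{-\lambda \eta} = p$ gives $\eta(p) = -\ln(1 - p)/\lambda$. A small sketch, again with the assumed value $\lambda = 0.5$:

```python
import math

lam = 0.5  # assumed rate, as before

def quantile(p):
    """Solve F(eta) = p for the exponential cdf F(x) = 1 - e^(-lam x)."""
    return -math.log(1 - p) / lam

print(quantile(0.5))   # the median: ln(2)/lam = 1.3862...
print(quantile(0.95))  # the 95th percentile
```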
We finish this section with an important result in probability that answers the question:
What is the probability that an observation of a r.v. falls more than $k$ standard deviations away from the mean?
The answer to this question depends on the probability distribution of the r.v., but we can derive an upper bound on this probability. This is given by Chebyshev’s Inequality.
Chebyshev’s Inequality
Chebyshev’s Inequality states that for any r.v. $X$ with mean $\mu$ and standard deviation $\sigma$, the following property holds for every $k > 0$:
$$P(|X - \mu| \ge k\sigma) \le \frac{1}{k^2}$$
So, for example, the probability that an observation of a r.v. falls more than $2$ standard deviations away from its mean can be no larger than $\frac{1}{2^2} = \frac{1}{4}$.
And the probability that an observation of a r.v. falls more than $3$ standard deviations away from its mean can be no larger than $\frac{1}{3^2} = \frac{1}{9}$.
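Because the bound must hold for every distribution, it is usually far from tight for any particular one. The simulation below, using the exponential model with the assumed rate $\lambda = 0.5$, estimates the actual probability of falling more than $k$ standard deviations from the mean and compares it to the Chebyshev bound:

```python
import random

random.seed(1)
lam = 0.5
mu, sigma = 1 / lam, 1 / lam  # for the exponential, mean and sd are both 1/lam

n = 100_000
draws = [random.expovariate(lam) for _ in range(n)]

for k in [2, 3]:
    freq = sum(abs(d - mu) > k * sigma for d in draws) / n
    print(k, freq, 1 / k**2)  # observed frequency vs. Chebyshev bound 1/k^2
```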