CIS523 Computational Mathematics for Bioinformatics


- This online lecture is for demonstration purposes

 

Lecture I

Introduction to Probability and Random Variables
 

About This Lecture

This lecture introduces concepts in probability and random variables.


 
 
 
 
 
 
 

 

Lecture Menu

About this Lecture

Learning Objectives

Probability and its axioms

Random variables

Binomial distributions

Summary

Review Questions

Practice Test & Answers

Required Readings
 
 

Learning Objectives

 After completing Lecture I, you will be familiar with:

  • axioms of probability theory
  • random variables
  • probability distributions of random variables
  • binomial experiments 
  • binomial distributions
Probability and its axioms
 

Several approaches to probability theory have been used:

  • classical
  • relative frequency
  • subjective
  • axiomatic


The axiomatic approach is consistent with the others and allows powerful mathematical tools to be applied to probability theory.  The elements of the axiomatic approach are presented here.

Kolmogorov introduced the axiomatic approach to probability as an improvement over and generalization of previous approaches.
 

The Axiomatic Approach to Probability
 

An experiment is a process that has a set of possible outcomes; the number of outcomes may be finite or infinite.

A sample space is the set of all possible outcomes of a given experiment.

In some of the definitions below the term "measurable" is used.  Its definition is beyond the scope of this course, but it is needed for certain infinite sample spaces to ensure that probabilities are meaningful.  In the case of finite sample spaces all sets and functions may be taken to be measurable.

An event is a measurable subset of a sample space.

Two events are disjoint if their intersection is the empty set.
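
The definitions above can be illustrated with a small Python sketch. It uses the two-coin experiment from the Practice Test as the sample space; the event names and the helper are_disjoint are illustrative assumptions, not part of the lecture.

```python
from itertools import product

# Sample space for tossing two coins: every ordered pair of H/T.
sample_space = {"".join(pair) for pair in product("HT", repeat=2)}

# Events are subsets of the sample space (names are illustrative).
at_least_one_head = {o for o in sample_space if "H" in o}
both_tails = {"TT"}
first_is_head = {o for o in sample_space if o[0] == "H"}


def are_disjoint(a, b):
    """Two events are disjoint if their intersection is the empty set."""
    return len(a & b) == 0


print(sorted(sample_space))                             # ['HH', 'HT', 'TH', 'TT']
print(are_disjoint(at_least_one_head, both_tails))      # True
print(are_disjoint(at_least_one_head, first_is_head))   # False
```

Because this sample space is finite, every subset can be treated as an event, so ordinary Python sets are enough here.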
 

Axioms of Probability

For any event A, a subset of the sample space S, we assign a number P(A), called the probability of the event A. This number satisfies the following three axioms of probability.

1.  P(A) >= 0
2.  P(S) = 1
3.  If A and B are disjoint events, P(A or B) = P(A)+P(B)

(Note that (3) states that if A and B are mutually exclusive (M.E.) events, meaning they cannot both occur and so P(A and B) = 0, the probability of their union is the sum of their probabilities.)

There is a fourth axiom needed when dealing with countably infinite families of events.

If {Ai} is a family of pairwise disjoint events (Ai and Aj are disjoint whenever i is not equal to j), then:

4.  The probability of the union of the Ai is equal to the sum of the probabilities P(Ai)

(The union and sum in (4) will be infinite if the family {Ai} is infinite.)
 

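Note that no probability can ever exceed 1.  For any event A, the events A and (not A) are disjoint and their union is S, so by axioms 2 and 3, P(A) + P(not A) = P(S) = 1.  Since P(not A) >= 0 by axiom 1, it follows that P(A) <= 1.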
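
As a rough check of the axioms on a finite sample space, the sketch below assumes one roll of a fair six-sided die with equally likely outcomes. This example, the function P, and the events A and B are assumptions chosen for illustration; they do not appear in the lecture.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
S = frozenset(range(1, 7))


def P(event):
    """Probability of an event (a subset of S) when all outcomes are equally likely."""
    event = frozenset(event)
    if not event <= S:
        raise ValueError("event must be a subset of the sample space")
    return Fraction(len(event), len(S))


A = {1, 2}            # the roll is 1 or 2
B = {5, 6}            # the roll is 5 or 6; disjoint from A
complement_A = S - A  # the event "not A"

assert P(A) >= 0                      # axiom 1
assert P(S) == 1                      # axiom 2
assert A & B == set()                 # A and B are disjoint
assert P(A | B) == P(A) + P(B)        # axiom 3

# Consequence of the axioms: P(A) + P(not A) = P(S) = 1, so P(A) <= 1.
assert P(A) + P(complement_A) == 1

print(P(A), P(B), P(A | B))  # 1/3 1/3 2/3
```

Exact fractions are used instead of floats so that the equality in axiom 3 can be tested without rounding error.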
 
 
 



Random variables
 

An intuitive introduction to random variables

Random variables and their distributions may informally be thought of as described below.  Formal definitions will be given later.
 

Random Variable 

  •  a variable (typically represented by x) that has a single numerical value, determined by chance, for each outcome of a procedure


Probability Distribution

  •  a graph, table, or formula that gives the probability for each value of the random variable


Discrete random variable

  •  has either a finite number of values or a countable number of values, where 'countable' means that there may be infinitely many values, but they can be listed one after another by a counting process.


Continuous random variable 

  •  has infinitely many values, and those values can be associated with measurements on a continuous scale with no gaps or interruptions.


Requirements for Probability Distributions

  • Σ P(x) = 1, where the sum is taken over all possible values of x
  • 0 <= P(x) <= 1  for every value of x


Mean, Variance and Standard Deviation of a Probability Distribution
 

  • Mean:                  µ = Σ [x • P(x)]
  • Variance:              σ^2 = Σ [(x - µ)^2 • P(x)]
  • Variance (shortcut):   σ^2 = Σ [x^2 • P(x)] - µ^2


The standard deviation σ is the square root of the variance.
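
The formulas above can be sketched in Python as follows. The distribution used is the one from Practice Test question 4; the function names summarize and variance_shortcut are illustrative assumptions.

```python
import math


def summarize(dist):
    """Return (mean, variance, standard deviation) of a discrete
    distribution given as a dict {x: P(x)}."""
    # Check the two requirements for a probability distribution.
    if any(p < 0 or p > 1 for p in dist.values()):
        raise ValueError("each P(x) must lie between 0 and 1")
    if not math.isclose(sum(dist.values()), 1.0):
        raise ValueError("the probabilities must sum to 1")
    mean = sum(x * p for x, p in dist.items())
    variance = sum((x - mean) ** 2 * p for x, p in dist.items())
    return mean, variance, math.sqrt(variance)


def variance_shortcut(dist):
    """Variance via the shortcut formula: sum of x^2 * P(x), minus mean^2."""
    mean = sum(x * p for x, p in dist.items())
    return sum(x ** 2 * p for x, p in dist.items()) - mean ** 2


dist = {0: 0.7, 1: 0.2, 2: 0.1}
mean, variance, std = summarize(dist)
print(round(mean, 4), round(variance, 4), round(std, 4))  # 0.4 0.44 0.6633

# Both variance formulas agree up to floating-point rounding.
assert math.isclose(variance, variance_shortcut(dist))
```

The sum of the probabilities is compared with math.isclose rather than ==, since decimal probabilities such as 0.7 are not represented exactly in floating point.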

The mean, or expected value, E of a random variable is equal to the mean of its probability distribution.

For a discrete random variable:

  • E = µ = Σ [x • P(x)]


Example:

Suppose a gambling game costs $1 to play, the winning payoff is $500, and the probability of winning is 1/1000, or .001.  The net gain is -$1 for a loss and $499 for a win.

  • This gives rise to a random variable with values -1 and 499.
  • The corresponding probabilities are .999 and .001.

 
Event     x       P(x)     x • P(x)
Win       499     .001      0.499
Lose      -1      .999     -0.999
Total                      -0.50

The mean = the expected value = E = -0.50, so:

  • The expected winnings are -$0.50.
  • The expected loss is $0.50.
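
The expected value in this example can be reproduced with a few lines of Python. The variable names are illustrative; exact fractions are used so that the result is not affected by rounding.

```python
from fractions import Fraction

p_win = Fraction(1, 1000)

# Net gain x mapped to its probability P(x), as in the table above.
outcomes = {499: p_win, -1: 1 - p_win}

# E = sum of x * P(x) over all values of x.
expected = sum(x * p for x, p in outcomes.items())

print(expected, float(expected))  # -1/2 -0.5
```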

 

Formal Definitions


 

A random variable is a measurable function from a sample space to the set of real numbers.
 

A random variable has a probability distribution which may be specified by either of two related functions:

  • its cumulative distribution function (cdf) or
  • its probability density function (pdf).


The cumulative distribution function (cdf) F of a random variable X is defined as follows:

F(t) = P(X <= t)

So the value of the cdf at a number t is the probability that the random variable has a value less than or equal to t.
 

The probability density function (pdf) f of a random variable X is defined as follows:

  • If X is discrete:
    • f(t) = P(X = t)
  • If X is continuous:
    • f(t) = F'(t)
      • F' is the derivative, or rate of change, of F

So in the continuous case the pdf gives the rate at which the cumulative probability is increasing at the value t.  (In the discrete case, f is often called the probability mass function.)
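
A minimal sketch of both cases, using the distributions from Practice Test questions 4 and 5. The helper names discrete_cdf and numeric_pdf are assumptions, and the derivative is only approximated numerically with a central difference.

```python
def discrete_cdf(f, t):
    """F(t) = P(X <= t) for a discrete X whose pdf is the dict f = {value: probability}."""
    return sum(p for x, p in f.items() if x <= t)


def numeric_pdf(F, t, h=1e-6):
    """Approximate f(t) = F'(t) with a central difference of step h."""
    return (F(t + h) - F(t - h)) / (2 * h)


# Discrete case: f(0) = .7, f(1) = .2, f(2) = .1
f = {0: 0.7, 1: 0.2, 2: 0.1}
print(round(discrete_cdf(f, 1), 10))   # 0.9


# Continuous case: F(t) = t^2 on [0, 1], whose exact pdf is f(t) = 2t.
def F(t):
    return t ** 2


print(round(numeric_pdf(F, 0.5), 6))   # 1.0
```

The numerical derivative is evaluated at an interior point of [0, 1]; at the endpoints a one-sided difference would be needed instead.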
 
 


Binomial distributions
 
 

Binomial Experiments

  • The experiment must have a fixed number of trials.
  • The trials must be independent.  (The outcome of any individual trial doesn’t affect the probabilities in the other trials.)
  • Each trial must have all outcomes classified into two categories (called success and failure).
  • The probabilities must remain constant for each trial.


Notation for Binomial Probability Distributions

  • n  = the fixed number of trials
  • x  = specific number of successes in n trials
  • p  = probability of success in one of n trials 
  • q  = probability of failure in one of n trials (q = 1 - p ) 
  • P(x) = probability of getting exactly x successes among n trials


Binomial probability formula (the pdf of a binomially distributed random variable)

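  • P(x) = nCx • p^x • q^(n-x), where nCx = n! / (x! (n-x)!) is the number of ways to choose which x of the n trials are successes


Example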

Consider a binomial experiment where:

  • n = 5
  • x = 3
  • p = 0.90
  • q = 0.10

Using the binomial probability formula to solve:

  • P(3) = 5C3 • 0.9^3 • 0.1^2 = 10 • 0.729 • 0.01 = 0.0729
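
The same calculation can be sketched in Python with the standard library function math.comb (available in Python 3.8 and later). The function name binomial_pmf is an illustrative assumption.

```python
from math import comb, isclose


def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n-x): probability of exactly x successes in n trials."""
    if not 0 <= x <= n:
        raise ValueError("x must be between 0 and n")
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)


# The example above: n = 5, x = 3, p = 0.90
print(round(binomial_pmf(3, 5, 0.90), 4))  # 0.0729

# The probabilities for x = 0, 1, ..., n sum to 1, as any distribution must.
assert isclose(sum(binomial_pmf(k, 5, 0.90) for k in range(6)), 1.0)
```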


Tables, calculators and computer software are also useful for solving binomial probability problems.
 
 

 

Summary

This lecture has introduced concepts and axioms for probability, random variables, probability distributions of random variables, binomial experiments and binomial distributions.
 

Review Questions

1.  What is a sample space?

2.  What is an event?

3.  How may we informally think of a random variable?

4.  What is the formal definition of a random variable?

5.  What are the cdf and pdf of a random variable?
 

Practice Test & Answers

1.  Disjoint events A and B have probabilities P(A) = .3 and P(B) = .4.  What is P(A or B)?

2.  Why is it impossible for disjoint events A and B to have probabilities P(A) = .7 and P(B) = .4?

3.  What is an appropriate sample space S for the experiment of tossing two coins and observing whether heads or tails came up on each coin?

4.  A random variable X has pdf given by f(0) = .7, f(1) = .2 and f(2) = .1.  Find E(X).

5.  A continuous random variable X has cdf F(t) = t^2 on the interval [0,1].  What is the pdf of X?
 

ANSWERS

1.  .7

2.  P(A or B) would have to be 1.1 but this is impossible because probabilities cannot exceed 1.

3.  {HH, HT, TH, TT}

4.  E(X) = .4

5.  f(t) = 2t on [0,1]
 

Required Readings

Chapter 1 of the text.