# Probability and Research Design

## Definition(s) of probability

We could choose one of several technical definitions for probability, but for our purposes it refers to an assessment of the likelihood of the various possible outcomes in an experiment or some other situation with a **“random”** outcome. Note that in probability theory the term “outcome” is used in a more general sense than the outcome vs. explanatory variable terminology used in the rest of this book.

In probability theory the term “outcome” applies not only to the “outcome variables” of experiments but also to **“explanatory variables”** if their values are not fixed. For example, the dose of a drug is normally fixed by the experimenter, so it is not an outcome in probability theory, but the age of a randomly chosen subject, even if it serves as an explanatory variable in an experiment, is not “fixed” by the experimenter, and thus can be an “outcome” under probability theory.

The collection of all possible outcomes of a particular random experiment (or other well-defined random situation) is called the sample space, usually abbreviated as S or Ω (omega). The outcomes in this set must be exhaustive (cover all possible outcomes) and mutually exclusive (non-overlapping), and should be as simple as possible.

We use the term event to represent any subset of the sample space. One way to think about events is that they can be defined before the experiment is carried out, and they either occur or do not occur when the experiment is carried out. In probability theory we learn to compute the chance that events like **“odd side up”** will occur based on assumptions about things like the probabilities of the elementary outcomes in the sample space.
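As a concrete sketch, a die roll's sample space and an event are just sets; the variable names and the equally-likely assumption here are illustrative choices, not notation from the text:

```python
from fractions import Fraction

# Sample space for one roll of a six-sided die: exhaustive and
# mutually exclusive, with each physical outcome listed exactly once.
S = {1, 2, 3, 4, 5, 6}

# An event is any subset of the sample space, e.g. "odd side up".
odd_side_up = {s for s in S if s % 2 == 1}

# Assuming all six outcomes are equally likely, the chance of an event
# is the fraction of outcomes in S that belong to it.
p_odd = Fraction(len(odd_side_up), len(S))
print(p_odd)  # 1/2
```

The event either occurs or does not on any particular roll; the probability describes it before the roll is made.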

Often we map each physical outcome of an experiment to a real number, e.g., mapping the die outcome “k dots up” to the number k. Technically, this mapping is called a random variable, but more commonly and informally we refer to the unknown numeric outcome itself (before the experiment is run) as a “random variable”. Random variables are commonly represented as upper-case English letters towards the end of the alphabet, such as X, Y, or Z. Sometimes the lower-case equivalents are used to represent the actual outcomes after the experiment is run.

Random variables are maps from the sample space to the real numbers, but they need not be one-to-one maps. For example, in the die experiment we could map all of the outcomes in the set {1du, 3du, 5du} to the number 0 and all of the outcomes in the set {2du, 4du, 6du} to the number 1, and call this random variable Y. If we call the random variable that maps the die outcomes to the numbers 1 through 6 X, then random variable Y can also be thought of as a map from X to Y, where the odd values of X map to 0 in Y and the even values to 1. Often the term **transformation** is used when we create a new random variable out of an old one in this way. It should now be obvious that many, many different random variables can be defined (or invented) for a given experiment.
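A minimal sketch of the map and its transformation, assuming the text's “1du” … “6du” labels mean “1 dot up” through “6 dots up”; the helper functions `X` and `Y` mirror the random variables above:

```python
# Physical outcomes of the die experiment, in the text's "du" (dots up) notation.
outcomes = ["1du", "2du", "3du", "4du", "5du", "6du"]

def X(outcome):
    # Random variable X: map each physical outcome to the number of dots showing.
    return int(outcome[0])

def Y(outcome):
    # Transformation of X: map odd values of X to 0 and even values to 1.
    return 0 if X(outcome) % 2 == 1 else 1

print([Y(o) for o in outcomes])  # [0, 1, 0, 1, 0, 1]
```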

A few more basic definitions are worth learning at this point. A random variable that takes on only the numbers 0 and 1 is commonly referred to as an **indicator (random) variable**. It is usually named to match the set that corresponds to the number 1, so in the previous example random variable Y is an indicator for even outcomes.

For any random variable, the term **support** is used to refer to the set of possible real numbers defined by the mapping from the physical experimental outcomes to the numbers. Therefore, for random variables we use the term “event” to represent any subset of the support.

Ignoring certain technical issues, probability theory takes a basic set of assigned (or assumed) probabilities and uses them (possibly with additional assumptions about something called independence) to compute the probabilities of various more complex events.

**The core of probability theory is making predictions about the
chances of occurrence of events based on a set of assumptions about the
underlying probability processes.**

One way to think about probability is that it quantifies how much we can know when we cannot know something exactly. Probability theory is deductive, in the sense that it involves making assumptions about a random (not completely predictable) process and then deriving valid statements about what is likely to happen based on mathematical principles. For this course a fairly small number of probability definitions, concepts, and skills will suffice.

For those who are unsatisfied with the loose definition of probability above, here is a brief description of three different approaches to probability, although it is not necessary to understand this material to continue through the chapter. If you want even more detail, I recommend *Comparative Statistical Inference* by Vic Barnett.

Valid probability statements do not claim which events will happen, but rather which are likely to happen. The starting point is sometimes a judgment that certain events are a priori equally likely. Then, using only the additional assumption that the occurrence of one event has no bearing on the occurrence of another separate event (called the assumption of independence), the likelihood of various complex combinations of events can be worked out through logic and mathematics. This approach has logical consistency, but cannot be applied to situations where it is unreasonable to assume equally likely outcomes and independence.
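To make the equally-likely-plus-independence recipe concrete, here is a sketch with two independent fair dice (my example, not the text's): the 36 ordered pairs are taken as a priori equally likely, and the probability that both dice land even is checked against the product rule that independence gives us.

```python
from fractions import Fraction
from itertools import product

# Two independent fair dice: under the classical approach all 36 ordered
# pairs of faces are a priori equally likely.
pairs = list(product(range(1, 7), repeat=2))

# Complex event: "both dice show an even face".
both_even = [(a, b) for (a, b) in pairs if a % 2 == 0 and b % 2 == 0]
p_both_even = Fraction(len(both_even), len(pairs))

# Independence says this must equal the product of the per-die probabilities.
p_even = Fraction(3, 6)
print(p_both_even, p_even * p_even)  # 1/4 1/4
```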

A second approach to probability is to define the probability of an outcome as the limit of the long-term fraction of times that outcome occurs in an ever-larger number of independent trials. This allows us to work with basic events that are not equally likely, but has the disadvantage that probabilities must be assigned through observation. Nevertheless this approach is sufficient for our purposes, which are mostly to figure out what would happen if certain probabilities are assigned to some events.
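The long-run-fraction idea can be sketched by simulation (a hypothetical fair die; the seed is arbitrary, chosen only so the run is reproducible):

```python
import random

random.seed(0)  # arbitrary seed, only for reproducibility

def fraction_of_sixes(n_trials):
    """Fraction of n_trials independent fair-die rolls that land on 6."""
    hits = sum(1 for _ in range(n_trials) if random.randint(1, 6) == 6)
    return hits / n_trials

# As the number of trials grows, the observed fraction settles near 1/6.
for n in (100, 10_000, 1_000_000):
    print(n, fraction_of_sixes(n))
```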

A third approach is subjective probability, where the probabilities of various events are our subjective (but consistent) assignments of probability. This has the advantage that events that only occur once, such as the next presidential election, can be studied probabilistically. Despite the seemingly bizarre premise, this is a valid and useful approach: it may give different answers for different people who hold different beliefs, but it still lets you calculate your rational but personal probability of future uncertain events, given your prior beliefs.

Regardless of which definition of probability you use, the calculations we need are basically the same. First we need to note that probability applies to some well-defined unknown or future situation in which some outcome will occur, the list of possible outcomes is well defined, and the exact outcome is unknown.

If the outcome is categorical or discrete quantitative, then each possible outcome gets a probability in the form of a number between 0 and 1 such that the sum of all of the probabilities is 1. This indicates that impossible outcomes are assigned probability zero, but assigning probability zero to an outcome does not necessarily mean that the outcome is impossible (see below). (Note that a probability is technically written as a number from 0 to 1, but is often converted to a percent from 0% to 100%. In case you have forgotten, to convert to a percent multiply by 100, e.g., 0.25 is 25%, 0.5 is 50%, and 0.975 is 97.5%.)

**“Every valid probability must be a number between 0 and 1 (or a
percent between 0% and 100%).”**

We will need to distinguish two types of random variables. Discrete random variables correspond to the categorical variables plus the discrete quantitative variables. Their support is a (finite or infinite) list of numeric outcomes, each of which has a non-zero probability. (Here we will loosely use the term “support” not only for the numeric outcomes of the random variable mapping, but also for the sample space when we do not explicitly map an outcome to a number.)

Examples of discrete random variables include the result of a coin toss (the support, using curly-brace set notation, is {H, T}), the number of tosses out of 5 that are heads ({0, 1, 2, 3, 4, 5}), the color of a random person’s eyes ({blue, brown, green, other}), and the number of coin tosses until a head is obtained ({1, 2, 3, 4, 5, ...}). Note that the last example has an infinitely sized support.
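The last example (tosses until the first head) can be sketched by simulation; the helper name and the fair-coin assumption are mine, and the seed is arbitrary:

```python
import random

random.seed(1)  # arbitrary seed, only for reproducibility

def tosses_until_first_head():
    """Flip a fair coin until a head appears; return the number of flips."""
    n = 1
    while random.random() >= 0.5:  # treat draws >= 0.5 as tails: flip again
        n += 1
    return n

# Any finite count can occur, so the support {1, 2, 3, ...} is infinite;
# for a fair coin the long-run average number of tosses is 2.
samples = [tosses_until_first_head() for _ in range(100_000)]
print(min(samples), max(samples), sum(samples) / len(samples))
```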

Continuous random variables correspond to the continuous quantitative variables. Their support is a continuous range of real numbers (or rarely several disconnected ranges) with no gaps. When working with continuous random variables in probability theory we think as if there is no rounding, and each value has an infinite number of decimal places. In practice we can only measure things to a certain number of decimal places; actual measurements of the continuous variable “length” might be 3.14, 3.15, etc., which does have gaps. But we approximate this with a continuous random variable rather than a discrete random variable because more precise measurement is possible in theory.

A strange aspect of working with continuous random variables is that each particular outcome in the support has probability zero, while none is actually impossible. The reason each outcome value has probability zero is that otherwise the probabilities of the infinitely many outcomes would add up to more than 1. So for continuous random variables we usually work with intervals of outcomes: we might say, e.g., that the probability that an outcome is between 3.14 and 3.15 is 0.02, while each real number in that range, e.g., π (exactly), has zero probability. Examples of continuous random variables include ages, times, weights, lengths, etc. All of these can theoretically be measured to an infinite number of decimal places.
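A simulation sketch of the interval-versus-point distinction, assuming (arbitrarily, for illustration) a random length that is uniform on [3.0, 4.0), so that the interval from 3.14 to 3.15 has probability exactly 0.01:

```python
import random

random.seed(2)  # arbitrary seed, only for reproducibility

# X uniform on [3.0, 4.0): intervals have positive probability, but any
# single real number has probability zero.
n = 1_000_000
draws = [random.uniform(3.0, 4.0) for _ in range(n)]

in_interval = sum(1 for x in draws if 3.14 <= x < 3.15)      # about 1% of draws
exactly_pi = sum(1 for x in draws if x == 3.14159265358979)  # essentially never

print(in_interval / n, exactly_pi)
```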

It is also possible for a random variable to be a mixture of discrete and continuous random variables. E.g., if an experiment is to flip a coin and report 0 if it is heads, or the time it was in the air if it is tails, then this variable is a mixture of the discrete and continuous types because the outcome “0” has a non-zero (positive) probability, while every individual positive number has zero probability (though intervals between two positive numbers have probability greater than zero).
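The coin-plus-air-time mixture can be sketched directly; modeling the air time as uniform on (0, 2) seconds is an arbitrary assumption for illustration, and the seed is chosen only for reproducibility:

```python
import random

random.seed(3)  # arbitrary seed, only for reproducibility

def coin_report():
    """Mixture variable: 0 for heads, otherwise a continuous air time (s)."""
    if random.random() < 0.5:            # heads
        return 0.0
    return random.uniform(0.0, 2.0)      # tails: air time, modeled as uniform

draws = [coin_report() for _ in range(100_000)]
# The single point 0 carries positive probability (about 1/2), while any
# particular positive time essentially never occurs exactly.
p_zero = sum(1 for x in draws if x == 0.0) / len(draws)
print(p_zero)
```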