Statistics: Introduction
Population vs Sample
The population includes all objects of interest, whereas the sample is only a
portion of the population. Parameters are associated with populations and
statistics with samples. Parameters are usually denoted using Greek letters
(mu, sigma) while statistics are usually denoted using Roman letters (x-bar, s).
There are several reasons why we don't work with populations. They are usually large,
and it is often impossible to get data for every object we're studying.
Sampling does not usually occur without cost, and the more items surveyed, the
larger the cost.
We compute statistics and use them to estimate parameters. The computation is the
first part of the statistics course (Descriptive Statistics) and the estimation
is the second part (Inferential Statistics).
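As a rough illustration of the difference between a parameter and a statistic, the sketch below builds a made-up population, computes its parameters, and then computes the matching statistics from a sample. The population values, the sample size of 50, and all variable names are assumptions chosen for the example, not part of the course material.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up measurements.
population = [random.gauss(170, 10) for _ in range(10_000)]

# Parameters describe the whole population (Greek letters).
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

# Statistics describe a sample drawn from it (Roman letters).
sample = random.sample(population, 50)
x_bar = statistics.fmean(sample)
s = statistics.stdev(sample)

print(f"parameter mu    = {mu:.2f}, statistic x-bar = {x_bar:.2f}")
print(f"parameter sigma = {sigma:.2f}, statistic s     = {s:.2f}")
```

The statistics will usually land close to the parameters without matching them exactly, which is why estimation is treated as a separate step.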
Discrete vs Continuous
Discrete variables are usually obtained by counting. There is a finite or countable
number of possible values with discrete data. You can't have 2.63 people in
the room.
Continuous variables are usually obtained by measuring. Length, weight, and time are all
examples of continuous variables. Since continuous variables are real numbers,
we usually round them. This implies a boundary depending on the number of
decimal places. For example, 64 is really anything in 63.5 <= x < 64.5.
Likewise, if there are two decimal places, then 64.03 is really anything in
64.025 <= x < 64.035. Boundaries always have one more decimal place than the
data and end in a 5.
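The boundary rule can be sketched in a few lines of Python. The function name class_boundaries is hypothetical, and the value is passed as a string so that the number of reported decimal places is preserved; this is one possible way to do the arithmetic, not a prescribed method.

```python
from decimal import Decimal

def class_boundaries(rounded: str) -> tuple[Decimal, Decimal]:
    """Return (lower, upper) such that lower <= x < upper rounds to `rounded`."""
    value = Decimal(rounded)
    # Exponent of the last reported digit: 0 for "64", -2 for "64.03".
    exponent = value.as_tuple().exponent
    # Half of one unit in the last reported place: one more decimal, ending in 5.
    half_unit = Decimal(5).scaleb(exponent - 1)
    return value - half_unit, value + half_unit

print(class_boundaries("64"))     # (Decimal('63.5'), Decimal('64.5'))
print(class_boundaries("64.03"))  # (Decimal('64.025'), Decimal('64.035'))
```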
Levels of Measurement
There are four levels of measurement: Nominal, Ordinal, Interval, and Ratio. These go
from lowest level to highest level. Data is classified according to the highest
level which it fits. Each additional level adds something the previous level
didn't have.
 Nominal is the lowest level. Only names are meaningful here.
 Ordinal adds an order to the names.
 Interval adds meaningful differences.
 Ratio adds a true zero so that ratios are meaningful.
Types of Sampling
There are five types of sampling: Random, Systematic, Convenience, Cluster, and
Stratified.
 Random sampling is analogous to putting everyone's name into a hat and drawing out several names. Each element in the population has an equal chance of being selected. While this is the preferred way of sampling, it is often difficult to do. It requires that a complete list of every element in the population be obtained. Computer-generated lists are often used with random sampling. You can generate random numbers using the TI-82 calculator.
 Systematic sampling is easier to do than random sampling. In systematic sampling, the list of elements is "counted off". That is, every kth element is taken. This is similar to lining everyone up and numbering off "1,2,3,4; 1,2,3,4; etc". When done numbering, all people numbered 4 would be used.
 Convenience sampling is very easy to do, but it's probably the worst technique to use. In convenience sampling, readily available data is used, such as the first people the surveyor runs into.
 Cluster sampling is accomplished by dividing the population into groups, usually geographically. These groups are called clusters or blocks. The clusters are randomly selected, and every element in the selected clusters is used.
 Stratified sampling also divides the population into groups, called strata. However, this time it is by some characteristic, not geographically. For instance, the population might be separated into males and females. A sample is taken from each of these strata using either random, systematic, or convenience sampling.
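The five techniques above can be sketched on a small made-up population. Everything here is an assumption for illustration: the 20 numbered people, the block and group fields, the sample sizes, and the choice of random sampling within each stratum.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population of 20 people, each tagged with a block
# (for cluster sampling) and a group (for stratified sampling).
population = [
    {"id": i, "block": i // 5, "group": "A" if i % 2 == 0 else "B"}
    for i in range(20)
]

# Random: every element has the same chance of being drawn.
random_sample = random.sample(population, 5)

# Systematic: count off 1..k and keep every kth element.
k = 4
systematic_sample = population[k - 1::k]

# Convenience: take whoever is first at hand.
convenience_sample = population[:5]

# Cluster: randomly pick whole blocks, then use every element in them.
chosen_blocks = random.sample(range(4), 2)
cluster_sample = [p for p in population if p["block"] in chosen_blocks]

# Stratified: split by a characteristic, then sample within each stratum.
stratified_sample = []
for group in ("A", "B"):
    stratum = [p for p in population if p["group"] == group]
    stratified_sample.extend(random.sample(stratum, 2))

print([p["id"] for p in systematic_sample])  # [3, 7, 11, 15, 19]
```

Note that the cluster sample keeps whole blocks while the stratified sample takes a few elements from every group, which is the main practical difference between the two.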

Sampling Risks
There are two types of sampling risk: the risk of incorrect acceptance of the research hypothesis and the risk of incorrect rejection. Both arise because the results and conclusions of a test conducted on a sample may differ from those that would be obtained if the test were conducted on the entire population.
The risk of incorrect acceptance is the risk that the sample yields a conclusion that supports a theory about the population when the theory does not actually hold in the population. The risk of incorrect rejection, on the other hand, is the risk that the sample yields a conclusion that rejects a theory about the population when the theory in fact holds true in the population.
Comparing the two types of risk, researchers fear the risk of incorrect rejection more than the risk of incorrect acceptance. Consider an example in which an experimental drug is tested for debilitating side effects. With incorrect acceptance, the researcher concludes that the drug has negative side effects when in truth it does not, and the entire population then abstains from taking the drug. With incorrect rejection, the researcher concludes that the drug has no negative side effects when in truth it does. The entire population then takes the drug believing it to be safe, and those who take it suffer the consequences of the researcher's mistake.