
Algebra/Chapter 1/Statistics

From Wikibooks, open books for an open world

1.12: Introduction to Statistics


Populations and Samples


Categorizing Data


Sampling Methods


Experiments


Measures of Center and Spread


The following three statistics represent three different ways to describe the center, or typical value, of a data set.

Mean - This is what we usually think of as the "average" of a data set. The mean is found by summing all the values in the data set and dividing by the size of the data set (that is, the number of elements in the set). In mathematical notation, for a data set $x_1, x_2, \ldots, x_n$ with $n$ elements, the mean is $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$.

For example, suppose 1, 2, 4, 6, 8, 9 is our data set. The sum is 1 + 2 + 4 + 6 + 8 + 9 = 30 and there are 6 elements in the data set, so the mean is 30/6 = 5.
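
The calculation above can be sketched in a few lines of Python. The function name `mean` and the decision to reject an empty data set are choices made for this illustration, not part of the text.

```python
def mean(values):
    """Return the sum of the values divided by how many values there are."""
    if not values:
        # Assumption for this sketch: an empty data set has no mean.
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


data = [1, 2, 4, 6, 8, 9]
print(mean(data))  # 5.0
```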

The mean, while a very useful statistic, has its flaws. Notably, its value may be heavily influenced by outliers - numbers in a data set which are significantly higher or lower than the majority of the data. It is often preferable to use the median instead to describe such data sets.
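
To see how strongly a single outlier can pull the mean, here is a small sketch using Python's standard `statistics` module. The data values are hypothetical and chosen only for illustration. The median used for comparison is defined in the next paragraph.

```python
import statistics

# Hypothetical data chosen for illustration; it does not come from the text.
typical = [30, 32, 35, 38, 40]
with_outlier = typical + [400]  # one value far above the rest

mean_before = statistics.mean(typical)
median_before = statistics.median(typical)
mean_after = statistics.mean(with_outlier)
median_after = statistics.median(with_outlier)

print("without outlier:", mean_before, median_before)
print("with outlier:   ", round(mean_after, 1), median_after)
```

Adding the single value 400 moves the mean from 35 to about 95.8, while the median only moves from 35 to 36.5.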

Median - This is the middle of our data set. To find the median you must first put your data values in numerical order (say, from smallest to largest). If you have an odd number of elements in your data set, there will be exactly one number in the middle, and this number is the median. If you have an even number of elements in your data set, then the median is the mean of the middle two numbers. For example, suppose 2, 2, 3, 4, 4, 5, 6, 7, 8, 9, 12, 13, 16, 22 is our data set. Since it has an even number of elements (14), we have to take the mean of the middle two, in this case 6 and 7, so the median is 6.5.
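
The two cases (odd and even number of elements) can be sketched as follows. The function name `median` is a choice made for this illustration.

```python
def median(values):
    """Return the middle value of the data once it is sorted."""
    if not values:
        # Assumption for this sketch: an empty data set has no median.
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)  # step 1: put the data in numerical order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of elements: exactly one middle value.
        return ordered[mid]
    # Even number of elements: take the mean of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [2, 2, 3, 4, 4, 5, 6, 7, 8, 9, 12, 13, 16, 22]
print(median(data))  # 6.5
```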

Mode - The mode is the value (or values) that occurs most often in a data set. Since mean, median, and mode are often confused with each other, an easy way to remember mode is 'most often': the first two letters in mode are 'm' and 'o', so imagine they stand for 'most often'. If two or more different values are tied for the greatest number of occurrences, the data set is said to have multiple modes. If you are asked to find the mode of a data set with multiple modes, then all of the modes should be listed. If no element of the data repeats, then there is no mode.

For example, suppose 1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 7 is our data set. Then the modes are 2 and 5. They both occur three times, and three is the maximum number of repeats in our data set.
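
Counting how often each value occurs is easy to sketch with `collections.Counter` from the Python standard library. The function name `modes` is a choice made for this illustration. It follows the convention stated above: it lists every tied mode and returns an empty list when no value repeats.

```python
from collections import Counter


def modes(values):
    """Return all values that occur most often, in increasing order."""
    counts = Counter(values)  # maps each value to its number of occurrences
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        # Convention from the text: if nothing repeats, there is no mode.
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 2, 3, 3, 4, 5, 5, 5, 7]))  # [2, 5]
```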

The following quantities tell us how spread out our data set is.

Range - The difference between the largest and smallest numbers in our data set. Notice this means the range is never negative.
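
As a minimal sketch, the range is one subtraction once the largest and smallest values are known. The function name `data_range` is a choice made for this illustration (it avoids shadowing Python's built-in `range`).

```python
def data_range(values):
    """Return the largest value minus the smallest value."""
    if not values:
        # Assumption for this sketch: an empty data set has no range.
        raise ValueError("range of an empty data set is undefined")
    return max(values) - min(values)


print(data_range([10, 13, 4, 7, 8]))  # 9
```

Because the largest value is never smaller than the smallest value, the result is never negative.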

Standard Deviation -

Variance -

Examples


Mean

Let's look at the following data set:

Data Values: 10, 13, 4, 7, 9 so n = 5

Now add the values together:

10 + 13 + 4 + 7 + 9 = 43

   43 / 5 = 8.6

Mean = 8.6


Median

Case 1:

Data Values: 10, 13, 4, 7, 8 so n = 5

Numerical Order: 4, 7, 8, 10, 13

Since 8 is the middle number,

Median = 8

Case 2:

Data Values: 10, 13, 4, 7, 8, 10 so n = 6

Numerical Order: 4, 7, 8, 10, 10, 13

Middle Numbers: 8 and 10

Find Mean: 8 + 10 = 18

          18 / 2 = 9

Median = 9


Mode

Data Values: 10, 13, 4, 7, 8, 10

10 is in the data set twice.

Mode = 10

Data Values: 4, 9, 13, 18, 4, 2, 9, 4, 13, 8, 9

4 and 9 each occur three times, more often than any other value.

Mode = 4, 9


Range

Data Values: 10, 13, 4, 7, 8

Numerical Order: 4, 7, 8, 10, 13

Difference of last and first: 13 - 4 = 9

Range = 9
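
The worked examples above can be cross-checked with Python's standard `statistics` module. This is only a sketch for verification, and the variable names are choices made for this illustration. `statistics.multimode` requires Python 3.8 or later.

```python
import statistics

example_mean = statistics.mean([10, 13, 4, 7, 9])
median_odd = statistics.median([10, 13, 4, 7, 8])
median_even = statistics.median([10, 13, 4, 7, 8, 10])
example_modes = statistics.multimode([4, 9, 13, 18, 4, 2, 9, 4, 13, 8, 9])
range_values = [10, 13, 4, 7, 8]
example_range = max(range_values) - min(range_values)

print(example_mean)    # 8.6
print(median_odd)      # 8
print(median_even)     # 9.0
print(example_modes)   # [4, 9]
print(example_range)   # 9
```

One difference from the convention used in this section: when no value repeats, `statistics.multimode` returns every value rather than reporting that there is no mode.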

Box and Whisker Plots
