Statistics/Introduction/What is Statistics
Your company has created a new drug that may cure arthritis. How would you conduct a test to confirm the drug's effectiveness?
The latest sales data have just come in, and your boss wants you to prepare a report for management on places where the company could improve its business. What should you look for? What should you not look for?
You and a friend are at a baseball game, and out of the blue he offers you a bet that neither team will hit a home run in that game. Should you take the bet?
You want to conduct a poll on whether your school should use its funding to build a new athletic complex or a new library. How many people do you have to poll? How do you ensure that your poll is free of bias? How do you interpret your results?
A widget maker in your factory that normally breaks 4 widgets for every 100 it produces has recently started breaking 5 widgets for every 100. When is it time to buy a new widget maker? (And just what is a widget, anyway?)
These are some of the many real-world situations that call for statistics. How would you approach them? Each can be worked through with a stepwise procedure (a kind of human algorithm), but is there a general way to frame such problems? Some common schemes:
- "Find possible solutions, decide on a solution, plan the solution, implement the solution, learn from the results for future solutions (or re-solution)."
- "SOAP: subjective (the problem as given), objective (the problem after examination), assessment (the better-defined problem), plan (decide whether management guidelines already exist and blueprint the solution for this case, or generate a new, risk-minimizing solution path)."
- "HAMRC: hypothesis, aim, methodology, results, conclusion." The hypothesis usually tested is the null hypothesis, the assumption that there is no real difference.
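The null hypothesis can be made concrete with the widget maker mentioned earlier. The sketch below is only an illustration under stated assumptions: it treats each widget as breaking independently, takes the null hypothesis to be "the true breakage rate is still 4%", and asks how likely it is to see 5 or more broken widgets in a batch of 100 by chance alone. The function name binomial_tail is a hypothetical helper, not part of any library.

```python
from math import comb


def binomial_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Null hypothesis (assumed): the machine still breaks 4% of widgets.
# Observation: 5 broken widgets in a batch of 100.
p_value = binomial_tail(n=100, k=5, p=0.04)
print(f"P(at least 5 broken out of 100 | true rate 4%) = {p_value:.3f}")
```

Under these assumptions the probability comes out well above the conventional 0.05 cutoff, so a single batch with 5 broken widgets is not, by itself, strong evidence that the machine has worsened. More batches would be needed before deciding to replace it.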
Then there is the joke that compares the different ways of thinking:
"A physicist, a chemist and a statistician were working together on a problem when the wastepaper basket spontaneously combusted (they all swore they had stopped smoking). The chemist said, 'Quick, we must reduce the concentration of the reactant, which is oxygen, by increasing the relative concentration of non-reactive gases, such as carbon dioxide and carbon monoxide. Place a fire blanket over the flames.' The physicist interjected, 'No, no, we must reduce the heat energy available for activating combustion; get some water to douse the flame.' Meanwhile, the statistician was running around lighting more fires. The others asked with alarm, 'What are you doing?' 'Trying to get an adequate sample size.'"
General Definition
Statistics, in short, is the study of data. It includes descriptive statistics (methods and tools for collecting, summarizing, and describing data) and inferential statistics (techniques for drawing probability-based conclusions and making predictions about a population from a sample).
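The difference between the two branches can be sketched in a few lines of Python. The data below are hypothetical measurements invented for illustration. The first part only describes the sample (descriptive statistics); the second part uses the sample to say something about the wider population (inferential statistics), here through a rough 95% confidence interval based on a normal approximation, which is an assumption that becomes shakier for samples this small.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical sample of ten measurements (made up for this example).
sample = [4.1, 3.8, 5.0, 4.4, 4.7, 3.9, 4.3, 4.6, 4.2, 4.0]

# Descriptive statistics: summarize the data we actually have.
sample_mean = mean(sample)
sample_sd = stdev(sample)  # sample standard deviation (n - 1 denominator)

# Inferential statistics: estimate the unknown population mean.
# Assumption: the normal approximation is adequate for this sample.
z = NormalDist().inv_cdf(0.975)  # two-sided 95% critical value
margin = z * sample_sd / sqrt(len(sample))
interval = (sample_mean - margin, sample_mean + margin)

print(f"mean = {sample_mean:.2f}, standard deviation = {sample_sd:.2f}")
print(f"approximate 95% interval for the population mean: "
      f"({interval[0]:.2f}, {interval[1]:.2f})")
```

The mean and standard deviation are statements about these ten numbers only, while the interval is a probability-based statement about a population that was never fully observed.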
Etymology
As its name implies, statistics has its roots in the idea of "the state of things". The word comes from the New Latin term statisticum collegium, meaning "council of state", and the Italian word statista, meaning "statesman". The German word Statistik originally referred to the analysis of data about the State. Gradually, the term came to be used to describe the collection and analysis of any sort of data.
Statistics as a subset of mathematics
As one would expect, statistics is largely grounded in mathematics, and the study of statistics draws on many major mathematical concepts: probability, distributions, samples and populations, the bell curve (the normal distribution), estimation, and data analysis.
Up ahead
Up ahead, we will learn about subjects in modern statistics and some practical applications of statistics. We will also lay out some of the background mathematical concepts required to begin studying statistics.