New Stats 1 Tips
TIPS FOR MAKING NOTE SHEETS FOR CLASS
SQUEEZE AS MUCH INFO IN AS POSSIBLE:
ORGANIZE IT:
DON'T FORGET TO INCLUDE:
TRY IT OUT BEFORE THE EXAM
|
Exam 3 Formula Sheet
Here's a copy of the suggested formula sheet for Exam 3. Make sure you know how to use the formulas on this sheet for the exam. |
Defining Statistics & Statistical Thinking
Statistics can be defined as the science of data. The study of statistics covers the whole process of working with data: generating it, analyzing it, presenting it, and interpreting it. This means statistics is not like any other math class you've ever taken before. Urbandictionary.com put it most simply when they defined statistics as: "The math course that is essentially the lovechild mathematics and English... And sometimes psychology." (2011)
Statistics is much more than gambling and surveys. Everyone uses statistics on a daily basis, even without realizing it. Statistics is a course where the numbers are not nearly as important as the thought process used to generate the numbers. Essentially, the best statisticians are skeptical of all computations where raw data is not present, and they look for flaws in how the data was generated, prepared, and analyzed, and in the conclusions drawn from it. The following topics cover common concerns people have when deciding which statistics are flawed.
Prepare Data
To prepare the data for use, you must consider the answers to a series of questions to avoid wasting time analyzing raw data that is flawed.
The most important questions for preparing data are context-based questions: What do the data mean? What is the goal of the study?
You should then consider the source of the data: Is the source objective? Is the source biased? The key idea here is to be vigilant and skeptical of studies from sources that may be biased.
Sampling methods must be considered as well: Does the method chosen greatly influence the validity of the conclusion? Voluntary response (or self-selected) samples often have bias (those with a special interest are more likely to participate). Are other methods more likely to produce good results?
Analyze Data
The first step of actual data analysis is to create the appropriate graphs (covered in Chapter 2). Once these graphs have been created, you should apply statistical methods (the rest of the book explains the statistical methods). Most of the formulas required to compute the numerical values are extremely daunting (some are not practical by hand), so statisticians rely heavily on technology (computers, graphing calculators, tables). With technology, good analysis does not require strong math skills, but it does require using common sense and paying attention to sound statistical methods.
Make Conclusions
The whole point of statistical analysis is to decide if the data is significant (different from normal data). Data that is statistically significant is very unlikely to happen by coincidence alone. Occasionally data can be statistically significant without being practically significant (useful in the real world).
PEOPLE MISUSE STATISTICS
Simply put, people who don't understand statistics use them to prove points all the time. When people don't use the correct methods, their conclusions end up being flawed. Below is a list of the most common misuses of statistics.
Concluding that one variable causes the other when in fact the variables are only correlated or associated (covered in Chapter 10). For example, temperature and violent crime appear to be related: as it gets hotter outside, the number of violent crimes increases. However, we cannot conclude that one causes the other based solely on the numerical relationship between temperature and the number of violent crimes. There may be another factor involved (like discomfort) that explains the relationship. This is where the mantra used in many social science classes, "Correlation does not imply causation," comes from.
Drawing conclusions from a sample that is too small. For example, suppose Prof. K surveys 20 people to figure out which Walt Disney World restaurants people like eating at, and comes to the decision that the Be Our Guest restaurant is the most popular counter service restaurant in the Magic Kingdom. Thanks to Google.com, Prof. K found that there are approximately 17 MILLION visitors to Walt Disney World every year. What does that say about the validity of Prof. K's data?
LOADED QUESTIONS
If survey questions are not worded well, the results can be misleading. In a famous psychological experiment, Elizabeth Loftus & John Palmer found that changing one word in a question could change people's responses. In the experiment, participants watched several videos of car accidents, and then each participant was asked "How fast were the cars going when they ____?" Several different words were used to complete the question, and the responses changed based on the word used. When participants were asked "How fast were the cars going when they contacted?" the average response was about 32 mph, yet when participants were asked "How fast were the cars going when they smashed?" the average response was about 41 mph. That's a difference of roughly 9 mph based on the same video footage.
Image Source: http://www.simplypsychology.org/loftus-results.jpg
ORDER OF QUESTIONS
The order of the questions can change the results as well. Sometimes questions are unintentionally loaded by such factors as the order of the items being considered. For example: "Would you say traffic contributes more or less to air pollution than industry?" results in: traffic - 45%; industry - 27%. When the order is reversed, the results change to: industry - 57%; traffic - 24%.
NONRESPONSES
MISSING DATA
PRECISE NUMBERS
Simply put, just because a number looks exact doesn't mean it hasn't been estimated. A number can be an estimate, but it should always be referred to as an estimate. Think about the many ways people can round any 4-digit number (e.g., 1,675.43 could be rounded to 2,000 or 1,700 or 1,680; these numbers are all correct, yet they are all different estimates). Now think about how you could be manipulated into buying one computer over another by someone saying one computer costs $1,700 vs. $1,680.
PERCENTAGES
Simply put, many people don't understand fractions or percentages and misuse them regularly. Misleading or unclear percentages are sometimes used. Textbook example: Continental Airlines ran an ad claiming "We've already improved 100% in the last six months" with respect to lost baggage. Does this mean Continental made no mistakes? |
Important things to remember when you get to Stats 2
Here are some VITAL concepts you should remember when you get to Stats 2. This information may be helpful while you are preparing for the final. |
Hypothesis Testing 101
Steps for hypothesis testing:
1. Write down what's given (e.g., sample standard deviation, sample size, population proportion).
2. Figure out what table & formula you should use.
3. Draw the picture!
   Left tail is less than
   Right tail is greater than
   Two tail is equal to or not equal to
4. Use the tables to find the critical value and add it to the picture.
   Use α to find the critical value in a one-tailed test
   Use α/2 to find the critical value in a two-tailed test
5. Write the hypotheses.
   a. The null hypothesis H0 always has an equal sign
   b. The alternative hypothesis has either a less than sign <, a greater than sign >, or a not equal sign ≠
6. Use the formula to find the test statistic (z, t, χ², etc.).
7. Decide whether to reject or fail to reject the null hypothesis.
   If the test statistic falls in the shaded critical region, reject the null hypothesis
   If the test statistic does NOT fall in the shaded critical region, "fail to reject" the null hypothesis
8. State your conclusion:
Source: Triola, M. F. (2003). Elementary Statistics (9th ed.), Pearson Education, Inc. |
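The hypothesis testing steps can also be sketched in code. The example below is a minimal illustration for one case only (a z test for a proportion), and it is not taken from the textbook: the function name `z_test_proportion` and the sample numbers (58 successes out of 100, claimed proportion 0.5) are made up, and Python's `statistics.NormalDist` stands in for the printed z table.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion(x, n, p0, alpha=0.05, tail="two"):
    """Critical-value z test for one proportion (hypothetical helper).

    tail is "left" (H1: p < p0), "right" (H1: p > p0), or "two" (H1: p != p0).
    Returns the test statistic, the critical value, and the reject decision.
    """
    p_hat = x / n                                # sample proportion
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # step 6: test statistic
    std_normal = NormalDist()                    # replaces the z table

    if tail == "left":
        critical = std_normal.inv_cdf(alpha)          # all of alpha in the left tail
        reject = z < critical
    elif tail == "right":
        critical = std_normal.inv_cdf(1 - alpha)      # all of alpha in the right tail
        reject = z > critical
    else:
        critical = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
        reject = abs(z) > critical
    return z, critical, reject


# Made-up data: 58 successes in 100 trials, H0: p = 0.5, H1: p != 0.5
z, critical, reject = z_test_proportion(58, 100, 0.5, alpha=0.05, tail="two")
decision = "Reject H0" if reject else "Fail to reject H0"
print(round(z, 2), round(critical, 2), decision)
```

With these made-up numbers the test statistic (about 1.6) does not fall in the critical region beyond about 1.96, so the decision is "fail to reject."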
Finding the mean and variance of a probability distribution
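Mean: μ = Σ[x · P(x)]
In English this means you need to multiply each x value by the corresponding probability and get the sum of the results.
Variance: σ² = Σ[(x − μ)² · P(x)]
In English this means you need to do the following: subtract the mean from each x value, square each difference, multiply each squared difference by the corresponding probability, and get the sum of the results.
Note: remember the standard deviation is the square root of the variance.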
Example 1: Find the mean and variance of the following probability distribution.
To find the mean of the distribution, we need to add another vertical column onto our table and a total row at the bottom of our table. Compute x*P(x) for each x value, then get the total of the column as our mean.
Thus the mean of example 1 is 1.7.
To find the variance of the distribution we need to add 3 new vertical columns onto our original table and a total row at the bottom of the table. The computations in each of the new columns are as follows:
In the first new column, subtract the mean from each x value (remember our mean is 1.7).
In the second new column, square each answer from the first new column.
In the third new column, multiply each answer from the second new column by the corresponding probability, and finally get the sum of the answers from this step. That sum is the variance.
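The column-by-column method above can be checked with a short script. This is only a sketch: the x values and probabilities below are made up for illustration (they are not the table from Example 1, although they also happen to have a mean of 1.7), and `mean_and_variance` is a hypothetical helper name.

```python
from math import sqrt


def mean_and_variance(xs, probs):
    """Mean and variance of a discrete probability distribution."""
    # Mean: multiply each x by its probability and add up the results
    mean = sum(x * p for x, p in zip(xs, probs))
    # Variance: (x - mean), squared, times P(x), then add up the results
    variance = sum((x - mean) ** 2 * p for x, p in zip(xs, probs))
    return mean, variance


# Made-up distribution for illustration only
xs = [0, 1, 2, 3]
probs = [0.1, 0.3, 0.4, 0.2]

mean, variance = mean_and_variance(xs, probs)
std_dev = sqrt(variance)  # standard deviation is the square root of the variance
print(round(mean, 2), round(variance, 2), round(std_dev, 2))
```

For this made-up table the mean is about 1.7, the variance about 0.81, and the standard deviation about 0.9.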
|
Checking a probability distribution for validity
When you are asked if a probability distribution (table) is valid you need to answer 3 questions.
1. Does the sum of P(x) add up to any number other than 1?
2. Are there any negative probabilities?
3. Are there any probabilities larger than 1?
If the answer to any of these questions is Yes, the table is NOT a probability distribution.
Example 1:
1. No, the sum of P(x) adds up to 1: 0.129+0.257+0.659+0.008+(-0.053) = 1.000
2. Yes, there is a negative probability (-0.053).
3. No, there aren't any probabilities larger than 1.
Since we said there are negative probabilities, example 1 is NOT a probability distribution.
Example 2:
1. No, the sum of P(x) adds up to 1: 0.2+0.2+0.2+0.2+0.2 = 1.000
2. No, there aren't any negative probabilities. The negative numbers are values for x, not the probability of x.
3. No, there aren't any probabilities larger than 1.
Since we said No to all of the questions, example 2 is a probability distribution.
Example 3:
1. Yes, the sum of P(x) adds up to 0.999: 0.2+0.2+0.2+0.2+0.199 = 0.999
2. No, there aren't any negative probabilities. The negative numbers are values for x, not the probability of x.
3. No, there aren't any probabilities larger than 1.
Since we said Yes to the first question, example 3 is NOT a probability distribution. |
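The three validity questions can be sketched as a small checker. The function name `check_distribution` and the tolerance value are assumptions made for this illustration; the probability lists are the ones appearing in the sums of Examples 1 to 3.

```python
from math import isclose


def check_distribution(probabilities, tolerance=1e-9):
    """Return a list of problems; an empty list means the table is valid."""
    problems = []
    # Question 1: does the sum differ from 1? (small tolerance for float rounding)
    if not isclose(sum(probabilities), 1.0, abs_tol=tolerance):
        problems.append("sum is not 1")
    # Question 2: are there any negative probabilities?
    if any(p < 0 for p in probabilities):
        problems.append("negative probability")
    # Question 3: are there any probabilities larger than 1?
    if any(p > 1 for p in probabilities):
        problems.append("probability larger than 1")
    return problems


example_1 = [0.129, 0.257, 0.659, 0.008, -0.053]
example_2 = [0.2, 0.2, 0.2, 0.2, 0.2]
example_3 = [0.2, 0.2, 0.2, 0.2, 0.199]

for name, probs in [("Example 1", example_1),
                    ("Example 2", example_2),
                    ("Example 3", example_3)]:
    problems = check_distribution(probs)
    print(name, "valid" if not problems else problems)
```

Only the P(x) column is passed in, which mirrors the point made in the examples: negative x values are fine, negative probabilities are not.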
Finding Weighted Means and Averages For Frequency Distributions
Find the mean of this frequency distribution:
1. The first thing you need to do is to find the midpoint for each class.
2. Next, multiply each midpoint by the frequency for that class, and total both the Midpoint * frequency column and the frequency column.
3. Finally, divide the total of the Midpoint * frequency column by the total of the frequency column to get your mean. 1940 / 60 = 32.333333333... Always round to one more decimal place than the original data has (since our classes were whole numbers, we should round to 1 decimal place). So our final answer for the mean of this frequency distribution is 32.3. |
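The same midpoint method can be sketched in code. The class limits and frequencies below are made up for illustration (they are not the table from the example above, so the answer differs from 32.3), and `grouped_mean` is a hypothetical helper name.

```python
def grouped_mean(classes):
    """Mean of a frequency distribution given (lower, upper, frequency) rows."""
    total_frequency = 0
    total_weighted = 0.0
    for lower, upper, frequency in classes:
        midpoint = (lower + upper) / 2          # step 1: class midpoint
        total_weighted += midpoint * frequency  # step 2: midpoint * frequency
        total_frequency += frequency
    return total_weighted / total_frequency     # step 3: divide the two totals


# Made-up frequency distribution: (lower limit, upper limit, frequency)
classes = [
    (10, 19, 8),
    (20, 29, 15),
    (30, 39, 20),
    (40, 49, 12),
    (50, 59, 5),
]

mean = grouped_mean(classes)
print(round(mean, 1))  # rounded to one more decimal place than the class limits
```

For this made-up table the totals are 1980 and 60, so the mean is 33.0.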
Calculator Shortcuts for graphs and histograms
To enter data into a list:
Press stat
Press enter
Put data into a list (remember the list number that displays at the top of the screen: L1, L2, L3, L4, L5, or L6)

To view graphs:
Press 2nd
Press Y= (it's just below your screen)
Press enter
Press enter
Press the down arrow
Use your left and right arrow keys to highlight the histogram or type of graph you want to see
Once you've highlighted the correct graph, press enter
Press the down arrow
Xlist: should say the list number (to type in a list number, press 2nd, then the number of the list)
Freq: should be 1
Press zoom
Press 9
Press trace
Use your left and right arrow keys to view the different values for the histogram or graph
Min = is the lower limit
Max < is the upper limit
n = is the frequency for that class

To clear a list:
Press stat
Press 4
Press 2nd
Press the number of the list you want to clear
Press enter
|