Welcome to Statistics for Psychology. We imagine you to be like other students we have known who have taken this course. You have chosen to major in psychology or a related field because you are fascinated by people: by the visible behaviors of the people around you, perhaps too by their inner lives as well as by your own. Some of you are highly scientific sorts; others are more intuitive. Some of you are fond of math; others are less so, or even afraid of it. Whatever your style, we welcome you. We want to assure you that if you give this book some special attention (perhaps a little more than most textbooks require), you will learn statistics. The approach used in this book has successfully taught all sorts of students before you, including those who had taken statistics previously and done poorly. With this book and your instructor’s help, you will learn statistics and learn it well.

More importantly, we want to assure you that whatever your reason for studying psychology or a related field, this course is not a waste of time. Learning about statistics helps you to read the work of other psychologists, to do your own research if you so choose, and to hone both your reasoning and intuition. Formally, statistics is a branch of mathematics that focuses on the organization, analysis, and interpretation of a group of numbers. But really what is statistics? Think of statistics as a tool that has evolved from a basic thinking process employed by every human: you observe a thing; you wonder what it means or what caused it; you have an insight or make an intuitive guess; you observe again, but now in detail, or you try making little changes in the process to test your intuition. Then you face the eternal problem: was your hunch confirmed or not? What are the chances that what you observed this second time will happen again and again, so that you can announce your insight to the world as something probably true?

✪ Summary
✪ Key Terms
✪ Example Worked-Out Problems
✪ Practice Problems
✪ Using SPSS
✪ Chapter Note

CHAPTER 1

ISBN 0-558-46761-X

Statistics for Psychology, Fifth Edition, by Arthur Aron, Elaine N. Aron, and Elliot J. Coups. Published by Prentice Hall. Copyright © 2009 by Pearson Education, Inc.

Statistics is a method of pursuing truth. As a minimum, statistics can tell you the likelihood that your hunch is true in this time and place and with these sorts of people. This pursuit of truth, or at least its future likelihood, is the essence of psychology, of science, and of human evolution. Think of the first research questions: what will the mammoths do next spring? What will happen if I eat this root? It is easy to see how the early accurate “researchers” survived. You are here today because your ancestors exercised brains as well as brawn. Do those who come after you the same favor: think carefully about outcomes. Statistics is one good way to do that.

Psychologists use statistical methods to help them make sense of the numbers they collect when conducting research. The issue of how to design good research is a topic in itself, summarized in a Web Chapter (Overview of the Logic and Language of Psychology Research) available on the Web site for this book (http://www.pearsonhighered.com/). But in this text we confine ourselves to the statistical methods for making sense of the data collected through research.

Psychologists usually use a computer and statistical software to carry out statistical procedures, such as the ones you will learn in this book. However, the best way to develop a solid understanding of statistics is to learn how to do the procedures by hand (with the help of a calculator). To minimize the amount of figuring you have to do, we use relatively small groups of numbers in each chapter’s examples and practice problems. We hope that this will also allow you to focus more on the underlying principles and logic of the statistical procedure, rather than on the mathematics of each practice problem (such as subtracting 3 from 7 and then dividing the result by 2 to give an answer of 2). (See the Introduction to the Student on pp. xvi–xviii for more information on the goals of this book.) Having said that, we also recognize the importance of learning how to do statistical procedures on a computer, as you most likely would when conducting your own research. So, at the end of relevant chapters, there is a section called Using SPSS (see also the Study Guide and Computer Workbook that accompanies this text and that includes a guide to getting started with SPSS). SPSS statistical software is commonly used by psychologists and other behavioral and social scientists to carry out statistical analyses. Check with your instructor to see if you have access to SPSS at your institution.

The Two Branches of Statistical Methods

There are two main branches of statistical methods.

1. Descriptive statistics: Psychologists use descriptive statistics to summarize and describe a group of numbers from a research study.

2. Inferential statistics: Psychologists use inferential statistics to draw conclusions and to make inferences that are based on the numbers from a research study but that go beyond the numbers. For example, inferential statistics allow researchers to make inferences about a large group of individuals based on a research study in which a much smaller number of individuals took part.

descriptive statistics: procedures for summarizing a group of scores or otherwise making them more comprehensible.

inferential statistics: procedures for drawing conclusions based on the scores collected in a research study but going beyond them.

statistics: branch of mathematics that focuses on the organization, analysis, and interpretation of a group of numbers.


variable: characteristic that can have different values.

value: possible number or category that a score can have.

score: particular person’s value on a variable.

In this chapter and the next, we focus on descriptive statistics. This topic is important in its own right, but it also prepares you to understand inferential statistics. Inferential statistics are the focus of the remainder of the book.

In this chapter we introduce you to some basic concepts, and then you will learn to use tables and graphs to describe a group of numbers. The purpose of descriptive statistics is to make a group of numbers easy to understand. As you will see, tables and graphs help a great deal.

Some Basic Concepts

Variables, Values, and Scores

As part of a larger study (Aron, Paris, & Aron, 1995), researchers gave a questionnaire to students in an introductory statistics class during the first week of the course. One question asked was, “How stressed have you been in the last 2½ weeks, on a scale of 0 to 10, with 0 being not at all stressed and 10 being as stressed as possible?” (How would you answer?) In this study, the researchers used a survey to examine students’ level of stress. Other methods that researchers use to study stress include measuring stress-related hormones in human blood or conducting controlled laboratory studies with animals.

In this example, level of stress is a variable, which can have values from 0 to 10, and the value of any particular person’s answer is the person’s score. If you answered 6, your score is 6; your score has a value of 6 on the variable called “level of stress.”

More formally, a variable is a condition or characteristic that can have different values. In short, it can vary. In our example, the variable was level of stress, which can have the values of 0 through 10. Height is a variable, social class is a variable, score on a creativity test is a variable, type of psychotherapy received by patients is a variable, speed on a reaction time test is a variable, number of people absent from work on a given day is a variable, and so forth.

A value is just a number, such as 4, –81, or 367.12. A value can also be a category, such as male or female, or a psychiatric diagnosis—major depression, post-traumatic stress disorder—and so forth.

Finally, on any variable, each person studied has a particular number or score that is his or her value on the variable. As we’ve said, your score on the stress variable might have a value of 6. Another student’s score might have a value of 8.
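For readers who find it easier to see these three terms in code, here is a minimal sketch in Python. It is only an illustration: the labels for the two students and the helper name `is_valid_score` are our own inventions, not part of the study.

```python
# Hypothetical illustration of variable, values, and scores.
# The student labels and the helper function are invented for this sketch.
variable = "level of stress"      # the characteristic being measured
values = range(0, 11)             # the possible values: 0 through 10

# Each person's score is that person's value on the variable.
scores = {"you": 6, "another student": 8}


def is_valid_score(score, possible_values=values):
    """Return True if the score is one of the variable's possible values."""
    return score in possible_values


for person, score in scores.items():
    print(f"{person}: score of {score} on '{variable}', "
          f"valid = {is_valid_score(score)}")
```

The point of the check is simply that a score is always one of the values the variable can take; a stress rating of 11 would not be a possible score on this scale.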

Psychology research is about variables, values, and scores (see Table 1–1). The formal definitions are a bit abstract, but in practice, the meaning usually is clear.

Table 1–1 Some Basic Terminology

Term | Definition | Examples
Variable | Condition or characteristic that can have different values | Stress level, age, gender, religion
Value | Number or category | 0, 1, 2, 3, 4, 25, 85, female, Catholic
Score | A particular person’s value on a variable | 0, 1, 2, 3, 4, 25, 85, female, Catholic

Levels of Measurement (Kinds of Variables)

Most of the variables psychologists use are like those in the stress ratings example: the scores are numbers that tell you how much there is of what is being measured. In the stress ratings example, the higher the number is, the more stress there is. This is an example of a numeric variable. Numeric variables are also called quantitative variables.

equal-interval variable: variable in which the numbers stand for approximately equal amounts of what is being measured.

ratio scale: an equal-interval variable is measured on a ratio scale if it has an absolute zero point, meaning that the value of zero on the variable indicates a complete absence of the variable.

discrete variable: variable that has specific values and that cannot have values between these specific values.

continuous variable: variable for which, in theory, there are an infinite number of values between any two values.

There are several kinds of numeric variables. In psychology research the most important distinction is between two types: equal-interval variables and rank-order variables. An equal-interval variable is a variable in which the numbers stand for approximately equal amounts of what is being measured. For example, grade point average (GPA) is a roughly equal-interval variable, since the difference between a GPA of 2.5 and 2.8 means about as much as the difference between a GPA of 3.0 and 3.3 (each is a difference of 0.3 of a GPA). Most psychologists also consider scales like the 0-to-10 stress ratings as roughly equal interval. So, for example, a difference between stress ratings of 4 and 6 means about as much as the difference between 7 and 9.

Some equal-interval variables are measured on what is called a ratio scale. An equal-interval variable is measured on a ratio scale if it has an absolute zero point. An absolute zero point means that the value of zero on the variable indicates a complete absence of the variable. Most counts or accumulations of things use a ratio scale. For example, the number of siblings a person has is measured on a ratio scale, because a zero value means having no siblings. With variables that are measured on a ratio scale, you can make statements about the difference in magnitude between values. So, we can say that a person with four siblings has twice as many siblings as a person with two siblings. However, most of the variables in psychology are not on a ratio scale.

Equal-interval variables can also be distinguished as being either discrete variables or continuous variables. A discrete variable is one that has specific values and cannot have values between the specific values. The number of times you went to the dentist in the last 12 months is a discrete variable. You may have gone 0, 1, 2, 3, or more times, but you can’t have gone 1.72 times or 2.34 times. With a continuous variable, there are in theory an infinite number of values between any two values. So, even though we usually answer the question “How old are you?” with a specific age, such as 19 or 20, you could also answer it by saying that you are 19.26 years old. Height, weight, and time are examples of other continuous variables.

The other main type of numeric variable, a rank-order variable, is a variable in which the numbers stand only for relative ranking. (Rank-order variables are also called ordinal variables.) A student’s standing in his or her graduating class is an example. The amount of difference in underlying GPA between being second and third in class standing could be very unlike the amount of difference between being eighth and ninth.

A rank-order variable provides less information than an equal-interval variable. That is, the difference from one rank to the next doesn’t tell you the exact difference in amount of what is being measured. However, psychologists often use rank-order variables because they are the only information available. Also, when people are being asked to rate something, it is sometimes easier and less arbitrary for them to make rank-order ratings. For example, when rating how much you like each of your friends, it may be easier to rank them by how much you like them than to rate your liking for them on a scale. Yet another reason researchers often use rank-order variables is that asking people to do rankings forces them to make distinctions. For example, if asked to rate how much you like each of your friends on a 1-to-10 scale, you might rate several of them at exactly the same level, but ranking would avoid such ties.
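The class-standing example can be made concrete with a short sketch in Python. The five GPAs below are invented purely for illustration; the sketch converts them to class standing and then shows that a one-step difference in rank can hide very different differences in GPA.

```python
# Hypothetical GPAs for five students (invented for illustration only).
gpas = {"A": 3.9, "B": 3.8, "C": 3.2, "D": 3.1, "E": 2.0}

# Rank-order variable: class standing, where 1 is the highest GPA.
ordered = sorted(gpas, key=gpas.get, reverse=True)
standing = {student: rank for rank, student in enumerate(ordered, start=1)}

# Adjacent ranks always differ by exactly 1,
# but the underlying GPA differences are not equal.
gaps = [round(gpas[a] - gpas[b], 1) for a, b in zip(ordered, ordered[1:])]

print(standing)   # class standing for each student
print(gaps)       # GPA difference between each pair of adjacent ranks
```

With these made-up numbers, the step from first to second place covers a GPA difference of 0.1, while the step from fourth to fifth covers 1.1. The ranks alone would not reveal this.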

Another major type of variable used in psychology research, which is not a numeric variable at all, is a nominal variable in which the values are names or categories. The term nominal comes from the idea that its values are names. (Nominal variables are also called categorical variables because their values are categories.) For example, for the nominal variable gender, the values are female and male. A person’s “score” on the variable gender is one of these two values. Another example is psychiatric diagnosis, which has values such as major depression, post-traumatic stress disorder, schizophrenia, and obsessive-compulsive disorder.

rank-order variable: numeric variable in which the values are ranks, such as class standing or place finished in a race. Also called ordinal variable.

numeric variable: variable whose values are numbers (as opposed to a nominal variable). Also called quantitative variable.

nominal variable: variable with values that are categories (that is, they are names rather than numbers). Also called categorical variable.

These different kinds of variables are based on different levels of measurement (see Table 1–2). Researchers sometimes have to decide how they will measure a particular variable. For example, they might use an equal-interval scale, a rank-order scale, or a nominal scale. The level of measurement selected affects the type of statistics that can be used with a variable. Suppose a researcher is studying the effects of a particular type of brain injury on being able to recognize objects. One approach the researcher might take would be to measure the number of different objects an injured person can observe at once. This is an example of an equal-interval level of measurement. Alternatively, the researcher might rate people as able to observe no objects (rated 0), only one object at a time (rated 1), one object with a vague sense of other objects (rated 2), or ordinary vision (rated 3). This would be a rank-order approach. Finally, the researcher might divide people into those who are completely blind (rated B), those who can identify the location of an object but not what the object is (rated L), those who can identify what the object is but not locate it in space (rated I), those who can both locate and identify an object but have other abnormalities of object perception (rated O), and those with normal visual perception (rated N). This is a nominal level of measurement.
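One way to see why the level of measurement matters is to write the three approaches from the brain injury example side by side. The following Python sketch is only an illustration under our own assumptions: the two participants (`p1`, `p2`) and their scores are invented, and the names `RANK_LABELS` and `Perception` are ours.

```python
from enum import Enum

# Equal-interval: number of objects a person can observe at once (a count).
objects_seen = {"p1": 1, "p2": 4}                        # hypothetical scores
difference = objects_seen["p2"] - objects_seen["p1"]     # differences are meaningful

# Rank-order: ratings 0 to 3 are ordered, but the steps need not be equal.
RANK_LABELS = {
    0: "no objects",
    1: "only one object at a time",
    2: "one object with a vague sense of other objects",
    3: "ordinary vision",
}
p1_rating, p2_rating = 1, 3
p2_sees_better = p2_rating > p1_rating                   # order is meaningful


# Nominal: the letter codes are only names for categories.
class Perception(Enum):
    B = "completely blind"
    L = "can locate an object but not identify it"
    I = "can identify an object but not locate it"
    O = "can locate and identify, with other abnormalities"
    N = "normal visual perception"


# Categories can be checked for sameness, but not ordered or subtracted.
same_category = Perception.L == Perception.I
```

In this sketch, subtraction is used only with the count, greater-than only with the ordered ratings, and equality only with the categories. That mirrors the idea that each level of measurement supports a different set of statistical operations.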

In this book, as in most psychology research, we focus mainly on numeric, equal-interval variables (or variables that roughly approximate equal-interval variables). We discuss statistical methods for working with nominal variables in Chapter 13 and methods for working with rank-order variables in Chapter 14.

levels of measurement: types of underlying numerical information provided by a measure, such as equal-interval, rank-order, and nominal (categorical).

Table 1–2 Levels of Measurement

Level | Definition | Example
Equal-interval | Numeric variable in which differences between values correspond to differences in the underlying thing being measured | Stress level, age
Rank-order | Numeric variable in which values correspond to the relative position of things measured | Class standing, position finished in a race
Nominal | Variable in which the values are categories | Gender, religion

How are you doing?

1. A father rates his daughter as a 2 on a 7-point scale (from 1 to 7) of crankiness. In this example, (a) what is the variable, (b) what is the score, and (c) what is the range of values?

2. What is the difference between a numeric and a nominal variable?

3. What is the difference between a discrete and a continuous variable?

4. Give the level of measurement of each of the following variables: (a) a person’s nationality (Mexican, Spanish, Ethiopian, Australian, etc.), (b) a person’s score on a standard IQ test, (c) a person’s place on a waiting list (first in line, second in line, etc.).


BOX 1–1 Important Trivia for Poetic Statistics Students

The word statistics comes from the Italian word statista, a person dealing with affairs of state (from stato, “state”). It was originally called “state arithmetic,” involving the tabulation of information about nations, especially for the purpose of taxation and planning the feasibility of wars.

Statistics were needed in ancient times to figure the odds of shipwrecks and piracy for marine insurance that would encourage voyages of commerce and exploration to far-flung places. The modern study of mortality rates and life insurance descended from the 17th-century plague pits: counting the bodies of persons cut down in the bloom of youth. The theory of errors (covered in Chapter 12) began in astronomy, that is, with stargazing; the theory of correlation (Chapter 11) has its roots in biology, from the observation of parent and child differences. Probability theory (Chapter 3) arose in the tense environs of the gambling table. The theory of analysis of experiments (Chapters 7 to 10) began in breweries and out among waving fields of wheat, where correct guesses determined not only the survival of a tasty beer but of thousands of marginal farmers. Theories of measurement and factor analysis (Chapter 15) derived from personality psychology, where the depths of human character were first explored with numbers. And chi-square (Chapter 13) came to us from sociology, where it was often a question of class.

In the early days of statistics, it was popular to use the new methods to prove the existence of God. For example, John Arbuthnot discovered that more male than female babies were born in London between 1629 and 1710. In what is considered the first use of a statistical test, he proved that the male birthrate was higher than could be expected by chance (assuming that 50:50 was chance) and concluded that there was a plan operating, since males face more danger to obtain food for their families, and only God, he said, could do such planning.

In 1767, John Michell also used probability theory to prove the existence of God when he argued that the odds were 500,000 to 1 against six stars being placed as close together as those in the constellation Pleiades; so their placement had to have been a deliberate act of the Creator.

Statistics in the “state arithmetic” sense are legally endorsed by most governments today. For example, the first article of the U.S. Constitution requires a census. And statistics helped the United States win the Revolutionary War. John Adams obtained critical aid from Holland by pointing out certain vital statistics, carefully gathered by the clergy in local parishes, demonstrating that the colonies had doubled their population every 18 years, adding 20,000 fighting men per annum. “Is this the case of our enemy, Great Britain?” Adams wrote. “Which then can maintain the war the longest?”

Similar statistics were observed in 1786 by Thomas Jefferson, later U.S. President. He wrote that his people “become uneasy” when there are more of them than 10 per square mile and that given the population growth of the new country, within 40 years these restless souls would fill up all of their country’s “vacant land.” Some 17 years later, Jefferson doubled the size of the United States’ “vacant” land through the Louisiana Purchase.

Answers

1. (a) crankiness, (b) 2, (c) 1 to 7.

2. A numeric variable has values that are numbers that tell you the degree or extent of what the variable measures; a nominal variable has values that are different categories and have no particular numerical order.

3. A discrete variable has specific values and has no values between the specific values. A continuous variable has, in theory, an infinite number of values between any two values.

4. (a) nominal, (b) equal-interval, (c) rank-order.


Frequency Tables

An Example

Let’s return to the stress ratings example. Recall that in this study, students in an introductory statistics class during the first week of the course answered the question, “How stressed have you been in the last 2½ weeks, on a scale of 0 to 10, with 0 being not at all stressed and 10 being as stressed as possible?” The actual study included scores from 151 students. To ease the learning for this example, we are going to use a representative subset of scores from 30 of the 151 students (this also saves you time if you want to try it for yourself). The 30 students’ scores (their ratings on the scale) are:

8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0, 9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8.

Looking through all these scores gives some sense of the overall tendencies, but this is hardly an accurate method. One solution is to make a table showing how many students used each of the 11 values that the ratings can have (0, 1, 2, and so on, through 10). We have done this in Table 1–3. We also figured the percentage each value’s frequency is of the total number of scores. Tables like this sometimes give only the raw-number frequencies, not the percentages, or only the percentages and not the raw-number frequencies. In addition, some frequency tables include, for each value, the total number of scores with that value and all values preceding it. These are called cumulative frequencies because they tell how many scores are accumulated up to this point on the table. If percentages are used, cumulative percentages also may be included (for an example, see Figure 1–18 in the Using SPSS section on page 30). Cumulative percentages give, for each value, the percentage of scores up to and including that value. The cumulative percentage for any given value (or for a score having that value) is also called a percentile. Cumulative frequencies and cumulative percentages allow you to see where a particular score falls in the overall group of scores.
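If you would like to see cumulative frequencies and cumulative percentages worked out for the 30 stress ratings, the following Python sketch is one way to do it. It is our own illustration, not output from SPSS, and the variable names are arbitrary.

```python
from collections import Counter
from itertools import accumulate

stress = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0,
          9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8]

counts = Counter(stress)
values = list(range(0, 11))            # all 11 possible ratings, 0 through 10
freqs = [counts[v] for v in values]    # a Counter gives 0 for unused values

# Cumulative frequency: scores with this value or any lower value.
cum_freqs = list(accumulate(freqs))

# Cumulative percentage: the same running total as a percentage of all scores.
cum_pcts = [round(100 * c / len(stress), 1) for c in cum_freqs]

print("Rating  Freq  CumFreq  CumPct")
for v, f, cf, cp in zip(values, freqs, cum_freqs, cum_pcts):
    print(f"{v:>6}  {f:>4}  {cf:>7}  {cp:>6.1f}")
```

Reading the row for a rating of 7, for example, the cumulative frequency is 19 and the cumulative percentage is 63.3, so a score of 7 is at about the 63rd percentile of this group.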

Table 1–3 is called a frequency table because it shows how frequently (how many times) each score was used. A frequency table makes the pattern of numbers easy to see. In this example, you can see that most of the students rated their stress level around 7 or 8, with few rating it very low.

How to Make a Frequency Table

There are four steps in making a frequency table.

❶ Make a list down the page of each possible value, from lowest to highest. In the stress ratings results, the list goes from 0, the lowest possible rating, up to 10, the highest possible rating.¹ Note that even if one of the ratings between 0 and 10 is not used, you still include that value in the listing, showing it as having a frequency of 0. For example, if no one gives a stress rating of 2, you still include 2 as one of the values on the frequency table.

❷ Go one by one through the scores, making a mark for each next to its value on your list. This is shown in Figure 1–1.

❸ Make a table showing how many times each value on your list is used. That is, add up the number of marks beside each value.

❹ Figure the percentage of scores for each value. To do this, take the frequency for that value, divide it by the total number of scores, and multiply by 100. You may need to round off the percentage. We recommend that you round percentages to one decimal place. Note that because of the rounding, your percentages do not usually add up to exactly 100% (but they should be close).
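The four steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure any particular statistics package uses; the function name `frequency_table` and its parameters are our own.

```python
from collections import Counter


def frequency_table(scores, lowest, highest):
    """Return (value, frequency, percent) rows for every possible value."""
    tally = Counter(scores)                      # Steps 2 and 3: count each score
    total = len(scores)
    rows = []
    for value in range(lowest, highest + 1):     # Step 1: list every possible value
        freq = tally[value]                      # 0 if the value was never used
        percent = round(100 * freq / total, 1)   # Step 4: percent, one decimal place
        rows.append((value, freq, percent))
    return rows


stress = [8, 7, 4, 10, 8, 6, 8, 9, 9, 7, 3, 7, 6, 5, 0,
          9, 10, 7, 7, 3, 6, 7, 5, 2, 1, 6, 7, 10, 8, 8]

table = frequency_table(stress, 0, 10)
for value, freq, percent in table:
    print(f"{value:>2}  {freq:>2}  {percent:>5.1f}")
```

The rows reproduce Table 1–3. Adding up the rounded percentages gives 99.9 rather than exactly 100, which is the rounding effect noted in Step ❹.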

Table 1–3 Frequency Table of Number of Students Rating Each Value of the Stress Scale

Stress Rating Frequency Percent

0 1 3.3

1 1 3.3

2 1 3.3

3 2 6.7

4 1 3.3

5 2 6.7

6 4 13.3

7 7 23.3

8 5 16.7

9 3 10.0

10 3 10.0

Source: Data based on Aron et al. (1995).

frequency table: listing of number of individuals having each of the different values for a particular variable.


Frequency Tables for Nominal Variables

The preceding steps assume you are using numeric variables, the most common situation. However, you can also use a frequency table to show the number of scores in each value (or category) of a nominal variable. For example, researchers (Aron, Aron, & Smollan, 1992) asked 208 students to name the closest person in their life. As shown in Table 1–4, 33 students selected a family member, 76 a nonromantic friend, 92 a romantic partner, and 7 selected some other person. Also in Table 1–4, the values listed on the left-hand side of the frequency table are the values (the categories) of the variable.
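The same percentage calculation works when the values are categories. Here is a short Python sketch using the frequencies reported for the closest-person study; the dictionary and variable names are our own.

```python
# Frequencies for each category of the nominal variable "closest person".
closest_person = {
    "Family member": 33,
    "Nonromantic friend": 76,
    "Romantic partner": 92,
    "Other": 7,
}

total = sum(closest_person.values())     # total number of students

# Percentage for each category, rounded to one decimal place.
percents = {category: round(100 * freq / total, 1)
            for category, freq in closest_person.items()}

for category, freq in closest_person.items():
    print(f"{category:<20} {freq:>3}  {percents[category]:>5.1f}")
```

Because the categories have no numerical order, the order of the rows is a matter of convenience; here they simply follow the order in which the categories were reported.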

Another Example

Tracy McLaughlin-Volpe and her colleagues (2001) had 94 introductory psychology students keep a diary of their social interactions for a week during the regular semester. Each time a participant had a social interaction lasting 10 minutes or longer, he or she would fill out a card. The card had questions about various aspects of the conversation and the conversation partner. Excluding family and work situations, the numbers of social interactions of 10 minutes or longer over a week for these students were as follows:

48, 15, 33, 3, 21, 19, 17, 16, 44, 25, 30, 3, 5, 9, 35, 32, 26, 13, 14, 14, 47, 47, 18, 11, 5, 19, 24, 17, 6, 25, 8, 18, 29, 1, 18, 22, 3, 22, 29, 2, 6, 10, 29, 10, 29, 21, 38, 41, 16, 17, 8, 40, 8, 10, 18, 7, 4, 4, 8, 11, 3, 23, 10, 19, 21, 13, 12, 10, 4, 17, 11, 21, 9, 8, 7, 5, 3, 22, 14, 25, 4, 11, 10, 18, 1, 28, 27, 19, 24, 35, 9, 30, 8, 26.

Now, let’s follow our four steps for making a frequency table.

❶ Make a list down the page of each possible value, from lowest to highest. The lowest possible number of interactions is 0. In this study, the highest number of interactions could be any number. However, the highest actual number in this group is 48; so we can use 48 as the highest value. Thus, the first step is to list these values down a page. (It might be good to use several columns so that you can have all the scores on a single page.)

❷ Go one by one through the scores, making a mark for each next to its value on your list. Figure 1–2 shows the results of this step.

❸ Make a table showing how many times each value on your list is used. Table 1–5 is the result.

Figure 1–1 Making a frequency table for the stress ratings scores. (Data based on Aron et al., 1995.) [Figure not reproduced: the 30 scores listed above a two-column layout headed Stress Rating and Frequency, with the ratings 0 through 10 listed down the page and a mark made for each score.]

Figure 1–2 Making a frequency table of students’ social interactions over a week. (Data from McLaughlin-Volpe et al., 2001.) [Figure not reproduced: the values 0 through 48 listed in columns, with a mark made for each score.]

Table 1–4 Frequency Table for a Nominal Variable: Closest Person in Life for 208 Students

Closest Person Frequency Percent

Family member 33 15.9

Nonromantic friend 76 36.5

Romantic partner 92 44.2

Other 7 3.4

Source: Data from Aron et al. (1992).

TIP FOR SUCCESS

When doing Step ❷, cross off each score as you mark it on the list. This should help you avoid mistakes, which are common in this step.


❹ Figure the percentage of scores for each value. We have not done so in this example because it would not help much for seeing the pattern of scores. However, if you want to check your understanding of this step, the first five percentages would be 0.0%, 2.1%, 1.1%, 5.3%, and 4.3%. (These are the percentages for frequencies of 0, 2, 1, 5, and 4, rounded to one decimal place.)
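If you want to check your own tallying for this example, the following Python sketch carries out the four steps on the 94 social interaction scores. It is only an illustration; the variable names are ours. It also applies the cross-check of adding up the frequencies to confirm they equal the number of scores.

```python
from collections import Counter

interactions = [
    48, 15, 33, 3, 21, 19, 17, 16, 44, 25, 30, 3, 5, 9, 35, 32, 26, 13, 14, 14,
    47, 47, 18, 11, 5, 19, 24, 17, 6, 25, 8, 18, 29, 1, 18, 22, 3, 22, 29, 2,
    6, 10, 29, 10, 29, 21, 38, 41, 16, 17, 8, 40, 8, 10, 18, 7, 4, 4, 8, 11,
    3, 23, 10, 19, 21, 13, 12, 10, 4, 17, 11, 21, 9, 8, 7, 5, 3, 22, 14, 25,
    4, 11, 10, 18, 1, 28, 27, 19, 24, 35, 9, 30, 8, 26,
]

tally = Counter(interactions)

# Step 1: every value from 0 (lowest possible) to the highest actual score.
values = range(0, max(interactions) + 1)

# Steps 2 and 3: frequency of each value, including values never used.
freqs = [tally[v] for v in values]

# Step 4: percentage of scores for each value, rounded to one decimal place.
percents = [round(100 * f / len(interactions), 1) for f in freqs]

# Cross-check: the frequencies should add up to the number of scores.
assert sum(freqs) == len(interactions)

first_five = percents[:5]
print(first_five)
```

The first five percentages agree with those given in Step ❹, and the frequencies agree with Table 1–5.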

Grouped Frequency Tables

Sometimes there are so many possible values that an ordinary frequency table is too awkward to give a simple picture of the scores. The last example was a bit like that, wasn’t it? The solution is to make groupings of values that include all values in a certain range. Consider the stress ratings example. Instead of having a separate frequency figure for the group of students who rated their stress as 8 and another for those who rated it as 9, you could have a combined category of 8 and 9. This combined category is a range of values that includes these two values. A combined category like this is called an interval. This particular interval of 8 and 9 has a frequency of 8 (the 5 scores with a value of 8 plus the 3 scores with a value of 9).

A frequency table that uses intervals is called a grouped frequency table. Table 1–6 is a grouped frequency table for the stress ratings example. (Note that in this example the full frequency table has only 11 different values. Thus, a grouped frequency table is not really necessary.) Table 1–7 is a grouped frequency table for the 94 students’ number of social interactions over a week.

A grouped frequency table can make information even more directly understandable than an ordinary frequency table can. Of course, the greater understandability of a grouped frequency table is at a cost. You lose some information: the details of the breakdown of frequencies in each interval.

interval: range of values in a grouped frequency table that are grouped together. (For example, if the interval size is 10, one of the intervals might be from 10 to 19.)

grouped frequency table: frequency table in which the number of individuals (frequency) is given for each interval of values.

TIP FOR SUCCESS

You can cross-check your work by adding the frequencies for all of the scores. This sum should equal the total number of scores you started with.

Table 1–5 Frequency Table for Number of Social Interactions During a Week for 94 College Students

Score Frequency Score Frequency Score Frequency

0 0 17 4 34 0

1 2 18 5 35 2

2 1 19 4 36 0

3 5 20 0 37 0

4 4 21 4 38 1

5 3 22 3 39 0

6 2 23 1 40 1

7 2 24 2 41 1

8 6 25 3 42 0

9 3 26 2 43 0

10 6 27 1 44 1

11 4 28 1 45 0

12 1 29 4 46 0

13 2 30 2 47 2

14 3 31 0 48 1

15 1 32 1

16 2 33 1

Source: Data from McLaughlin-Volpe et al. (2001).

Table 1–6 Grouped Frequency Table for Stress Ratings

Stress Rating Interval Frequency Percent

0–1 2 6.7

2–3 3 10.0

4–5 3 10.0

6–7 11 36.7

8–9 8 26.7

10–11 3 10.0

Source: Data based on Aron et al. (1995).


When setting up a grouped frequency table, it makes a big difference how many intervals you use. There are guidelines to help researchers with this, but in practice it is done automatically by the researcher’s computer (see the Using SPSS section for instructions on how to create frequency tables using statistical software). Thus, we will not focus on it in this book. However, should you have to make a grouped frequency table on your own, the key is to experiment with the interval size until you come up with one that is a round number (such as 2, 3, 5, or 10) and that creates about 5 to 15 intervals. Then, when actually setting up the table, be sure you set the start of each interval to a multiple of the interval size and the top end of each interval to the number just below the start of the next interval. For example, Table 1–6 uses six intervals with an interval size of 2. The intervals are 0–1, 2–3, 4–5, 6–7, 8–9, and 10–11. Note that each interval starts with a multiple of 2 (0, 2, 4, 6, 8, 10) and the top end of each interval (1, 3, 5, 7, 9) is the number just below the start of the next interval (2, 4, 6, 8, 10). Table 1–7 uses 10 intervals with an interval size of 5. The intervals are 0–4, 5–9, 10–14, 15–19, and so on, with a final interval of 45–49. Note that each interval starts with a multiple of 5 (0, 5, 10, 15, and so on) and that the top end of each interval (4, 9, 14, 19, and so on) is the number just below the start of the next interval (5, 10, 15, 20, and so on).
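If you would like to see the interval rule at work, here is a minimal Python sketch (Python is our choice for illustration; the book itself uses SPSS). The function name grouped_frequency_table is made up for this example, and the stress scores are rebuilt from the frequencies shown with Figure 1–3a. It assumes whole-number scores of 0 or higher.

```python
from collections import Counter

def grouped_frequency_table(scores, interval_size):
    """Return (start, end, frequency, percent) rows for equal-size intervals.

    Hypothetical helper for illustration; assumes non-negative whole-number scores.
    """
    # Each interval starts at a multiple of the interval size.
    counts = Counter((score // interval_size) * interval_size for score in scores)
    total = len(scores)
    rows = []
    for start in range(min(counts), max(counts) + 1, interval_size):
        frequency = counts.get(start, 0)   # keep empty intervals in the table
        end = start + interval_size - 1    # number just below the next start
        rows.append((start, end, frequency, round(100 * frequency / total, 1)))
    return rows

# Stress ratings rebuilt from the frequencies for ratings 0 through 10
stress_frequencies = [1, 1, 1, 2, 1, 2, 4, 7, 5, 3, 3]
stress_scores = [rating
                 for rating, frequency in enumerate(stress_frequencies)
                 for _ in range(frequency)]

stress_table = grouped_frequency_table(stress_scores, interval_size=2)
for start, end, frequency, percent in stress_table:
    print(f"{start}-{end}  {frequency:>2}  {percent:>5.1f}")

# Cross-check: the frequencies should add up to the number of scores.
print("Total:", sum(row[2] for row in stress_table))
```

With an interval size of 2, the rows should match Table 1–6. Trying other interval sizes on the same scores is a quick way to do the experimenting described above.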

How are you doing?

1. What is a frequency table?
2. Why would a researcher want to make a frequency table?
3. Make a frequency table for the following scores: 5, 7, 4, 5, 6, 5, 4.
4. What does a grouped frequency table group?

Answers

1. A frequency table is a systematic listing of the number of scores (the frequency) of each value in the group studied.

2. A frequency table makes it easy to see the pattern in a large group of scores.

3. Value  Frequency  Percent
     4        2        28.6
     5        3        42.9
     6        1        14.3
     7        1        14.3

4. A grouped frequency table groups the frequencies of adjacent values into intervals.
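For readers who want to check answer 3 by computer, the following is a small Python sketch (an illustration only; the variable names are ours). It counts how often each value occurs and converts each count to a percentage of all the scores.

```python
from collections import Counter

scores = [5, 7, 4, 5, 6, 5, 4]

counts = Counter(scores)   # frequency of each value
total = len(scores)        # number of scores

# One row per value, lowest to highest: (value, frequency, percent)
frequency_table = [
    (value, counts[value], round(100 * counts[value] / total, 1))
    for value in sorted(counts)
]

print("Value  Frequency  Percent")
for value, frequency, percent in frequency_table:
    print(f"{value:>5}  {frequency:>9}  {percent:>7.1f}")
```

Because each percentage is rounded to one decimal place, the percentages may not add up to exactly 100.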

Histograms

A graph is another good way to make a large group of scores easy to understand. A picture may be worth a thousand words, but it is sometimes worth a thousand numbers. A straightforward approach is to make a graph of the frequency table. One kind of graph of the information in a frequency table is a kind of bar chart called a histogram. In a histogram, the height of each bar is the frequency of each value in the frequency table. Ordinarily, in a histogram all the bars are put next to each other with no space in between. The result is that a histogram looks a bit like a city skyline. Figure 1–3 shows two histograms based on the stress ratings example (one based on the ordinary frequency table and one based on the grouped frequency table). Figure 1–4 shows a histogram based on the grouped frequency table for the example of the numbers of students’ social interactions in a week.

Table 1–7 Grouped Frequency Table for Numbers of Social Interactions During a Week for 94 College Students

Interval Frequency Percent

0–4 12 12.8

5–9 16 17.0

10–14 16 17.0

15–19 16 17.0

20–24 10 10.6

25–29 11 11.7

30–34 4 4.3

35–39 3 3.2

40–44 3 3.2

45–49 3 3.2

Source: Data from McLaughlin-Volpe et al. (2001).

histogram barlike graph of a frequency distribution in which the values are plotted along the horizontal axis and the height of each bar is the frequency of that value; the bars are usually placed next to each other without spaces, giving the appearance of a city skyline.


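[Figure 1–3, panel (a): histogram based on the frequency table. Horizontal axis: Stress Rating (0 to 10). Vertical axis: Frequency (0 to 7). Frequencies for ratings 0 through 10: 1, 1, 1, 2, 1, 2, 4, 7, 5, 3, 3.]

[Figure 1–3, panel (b): histogram based on the grouped frequency table. Horizontal axis: Stress Rating, marked at the interval midpoints 1, 3, 5, 7, 9, and 11. Vertical axis: Frequency (0 to 11). Frequencies for the intervals 0–1, 2–3, 4–5, 6–7, 8–9, and 10–11: 2, 3, 3, 11, 8, 3. The circled numbers ❶ to ❹ in each panel mark the four steps in making a histogram.]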

Figure 1–3 Histograms based on (a) a frequency table and (b) a grouped frequency table for the stress ratings example. (Data based on Aron et al., 1995.)


[Figure 1–4: histogram. Horizontal axis: Number of Social Interactions, marked at the interval midpoints 2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5, and 47.5. Vertical axis: Frequency (0 to 16). Frequencies for the intervals 0–4 through 45–49: 12, 16, 16, 16, 10, 11, 4, 3, 3, 3.]

Figure 1–4 Histogram for number of social interactions during a week for 94 college students based on grouped frequencies. (Data from McLaughlin-Volpe et al., 2001.)

BOX 1–2 Math Anxiety, Statistics Anxiety, and You: A Message for Those of You Who Are Truly Worried About This Course

Let’s face it: Many of you dread this course, even to the point of having a full-blown case of “statistics anxiety” (Zeidner, 1991). If you become tense the minute you see numbers, we need to talk about that right now.

First, this course is a chance for a fresh start with digits. Your past performance in (or avoidance of) geometry, trigonometry, calculus, or similar horrors need not influence in any way how well you comprehend statistics. This is largely a different subject.

Second, if your worry persists, you need to determine where it is coming from. Math or statistics anxiety, test anxiety, general anxiety, and generally low self-confidence each seems to play its own role in students’ difficulties with math courses (Cooper & Robinson, 1989; Dwinell & Higbee, 1991).

Is your problem mainly math or statistics anxiety? An Internet search will yield hundreds of wonderful books and Web sites to help you. We highly recommend Sheila Tobias’s classics Overcoming Math Anxiety (1995) or Succeed with Math: Every Student’s Guide to Conquering Math Anxiety (1987). Tobias, a former math avoider herself, suggests that your goal should be “math mental health,” which she defines as “the willingness to learn the math you need when you need it” (1995, p. 12). (Could it be that this course in statistics is one of those times?)

Tobias explains that math mental health is usually lost in elementary school, when you are called to the blackboard, your mind goes blank, and you are unable to produce the one right answer to an arithmetic problem. What confidence remained after such an experience probably faded during timed tests, which you did not realize were difficult for everyone except the most proficient few.

Tobias says that students who are good at math are not necessarily smarter than the rest of us, but they really know their strengths and weaknesses, and they have individual styles of thinking and feeling their way around a problem. They do not judge themselves harshly for mistakes. In particular, they do not expect to understand things instantly. Allowing yourself to be a “slow learner” does not mean that you are less intelligent. It shows that you are growing in math mental health.

Is your problem test anxiety? Test taking requires the use of the thinking part of our brain, the prefrontal cortex. When we are anxious, we naturally “downshift” to more basic, instinctual brain systems, and that effect ruins our thinking ability. Anxiety produces arousal, and one of the best understood relationships in psychology is between arousal and performance. Whereas moderate arousal helps performance, too much or too little dramatically reduces it. In the case of too much, things you have learned become harder to recall. Your mind starts to race, creating more anxiety, more arousal, and so on. Because during a test you may be fearing that you are “no good and never will be,” it is important to rethink beforehand any poor grades you may have received in the past. They most likely reflected your problems with tests more than your abilities.

There are many ways to reduce anxiety and arousal in general, such as learning to breathe properly and to take a brief break to relax deeply. Your counseling center should be able to help you or direct you to some good books on the subject. Again, many Web sites deal with reducing anxiety.

Test anxiety specifically is first reduced by overpreparing for a few tests, so that you go in with the certainty that you cannot possibly fail, no matter how aroused you become. The best time to begin applying this tactic is the first test of this course. There will be no old material to review, success will not depend on having understood previous material, and initial success will help you do well throughout the course. (You also might enlist the sympathy of your instructor or teaching assistant. Bring in a list of what you have studied, state why you are being so exacting, and ask if you have missed anything.) Your preparation must be ridiculously thorough, but only for a few exams. After these successes, your test anxiety should decline.

Also, create a practice test situation as similar to a real test as possible, making a special effort to duplicate the aspects that bother you most. If feeling rushed is the troubling part, once you think you are well prepared, set yourself a time limit for solving some homework problems. Make yourself write out answers fully and legibly. This may be part of what makes you feel slow during a test. If the presence of others bothers you (the sound of their scurrying pencils while yours is frozen in midair), do your practice test with others in your course. Even make it an explicit contest to see who can finish first.

Is your problem a general lack of confidence? Is something else in your life causing you to worry or feel bad about yourself? Then we suggest that it is time you tried your friendly college counseling center.

Lastly, could you be highly sensitive? A final word about anxiety and arousal. About 15 to 20% of humans (and all higher animals) seem to be born with a temperament trait that has been seen traditionally as shyness, hesitancy, or introversion (Eysenck, 1981; Kagan, 1994). But this shyness or hesitancy seems actually due to a preference to observe and an ability to notice subtle stimulation and process information deeply (Aron, 1996; Aron & Aron, 1997). This often causes highly sensitive persons (HSPs) to be very intuitive or even gifted. But it also means they are more easily overaroused by high levels of stimulation, like tests.

You might want to find out if you are an HSP (at http://www.hsperson.com). If you are, appreciate the trait’s assets and make some allowances for its one disadvantage, this tendency to become easily overaroused. It has to affect your performance on tests. What matters is what you actually know, which is probably quite a bit. This simple act of self-acceptance (that you are not less smart but are more sensitive) may in itself help ease your arousal when trying to express your statistical knowledge.

So good luck to all of you. We wish you the best while taking this course and in your lives.

How to Make a Histogram

There are four steps in making a histogram.

❶ Make a frequency table (or grouped frequency table).

❷ Put the values along the bottom of the page, from left to right, from lowest to highest. If you are making a histogram from a grouped frequency table, the values you put along the bottom of the page are the interval midpoints. The midpoint of an interval is halfway between the start of that interval and the start of the next highest interval. So, in Figure 1–4, the midpoint for the 0–4 interval is 2.5, because 2.5 is halfway between 0 (the start of the interval) and 5 (the start of the next highest interval). For the 5–9 interval, the midpoint is 7.5 because 7.5 is halfway between 5 (the start of the interval) and 10 (the start of the next highest interval). Do this for each interval. When you get to the last interval, find the midpoint between the start of the interval and the start of what would be the next highest interval. So, in Figure 1–4, the midpoint for the 45–49 interval is halfway between 45 (the start of the interval) and 50 (the start of what would be the next interval), which is 47.5.

❸ Make a scale of frequencies along the left edge of the page that goes from 0 at the bottom to the highest frequency for any value.

❹ Make a bar above each value with a height for the frequency of that value. For each bar, make sure that the middle of the bar is above its value.

TIP FOR SUCCESS Now try this yourself! Work out the interval midpoints for the grouped frequency table for the stress ratings example shown in Table 1–6. Your answers should be the same as the values shown along the bottom of Figure 1–3b.

When you have a nominal variable, the histogram is called a bar graph. Since the values of a nominal variable are not in any particular order, leave a space between the bars. Figure 1–5 shows a bar graph based on the frequency table in Table 1–4.
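The four histogram steps can also be tried out in a few lines of Python. This is only a rough sketch under our own assumptions: the helper name interval_midpoint is invented for this example, the frequencies come from Table 1–7, and the "histogram" is printed sideways as rows of # characters rather than drawn as upright bars.

```python
def interval_midpoint(start, interval_size):
    """Halfway between this interval's start and the next interval's start."""
    next_start = start + interval_size
    return (start + next_start) / 2

# Step 1: the grouped frequency table (Table 1-7, interval size 5)
interval_size = 5
starts = list(range(0, 50, interval_size))            # 0, 5, 10, ..., 45
frequencies = [12, 16, 16, 16, 10, 11, 4, 3, 3, 3]

# Step 2: the values along the bottom are the interval midpoints
midpoints = [interval_midpoint(start, interval_size) for start in starts]

# Step 3: the frequency scale runs from 0 to the highest frequency
highest_frequency = max(frequencies)

# Step 4: one bar per midpoint, with a length equal to its frequency
for midpoint, frequency in zip(midpoints, frequencies):
    print(f"{midpoint:>5.1f} | {'#' * frequency} {frequency}")
print(f"Frequency scale: 0 to {highest_frequency}")
```

Calling interval_midpoint with the starts 0, 2, 4, 6, 8, and 10 and an interval size of 2 gives the midpoints for the stress ratings intervals, which is one way to check your own work on those midpoints.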

[Figure 1–5: bar graph. Horizontal axis: Closest Person (Family member, Nonromantic friend, Romantic partner, Other). Vertical axis: Frequency (0 to 100).]

Figure 1–5 Bar graph for the closest person in life for 208 students (see Table 1–4). (Data from Aron et al., 1995.)

TIP FOR SUCCESS You will probably find it easier to make a histogram if you use graph paper.

How are you doing?

1. Why do researchers make frequency graphs?
2. When making a histogram from a frequency table, (a) what goes along the bottom, (b) what goes along the left edge, and (c) what goes above each value?
3. Make a histogram based on the following frequency table:

   Value  Frequency
     1        3
     2        4
     3        8
     4        5
     5        2


Shapes of Frequency Distributions

A frequency distribution shows the pattern of frequencies over the various values. A frequency table or histogram describes a frequency distribution because each shows the pattern or shape of how the frequencies are spread out, or “distributed.”

Psychologists also describe this shape in words. Describing the shape of a distribution is important both in the descriptive statistics of this chapter and the next and in the inferential statistics of later chapters.

Unimodal and Bimodal Frequency Distributions

One question is whether a distribution’s shape has only one main high point: one high “tower” in the histogram. For example, in the stress ratings study, the most frequent value is 7, giving a graph only one very high area. This is a unimodal distribution. If a distribution has two fairly equal high points, it is a bimodal distribution. Any distribution with two or more high points is called a multimodal distribution. (Strictly speaking, a distribution is bimodal or multimodal only if the peaks are exactly equal. However, psychologists use these terms more informally to describe the general shape.) Finally, a distribution with values of all about the same frequency is a rectangular distribution. Figure 1–7 shows examples of these frequency distribution shapes. As you will see, the graphs in Figure 1–7 are not histograms, but special line graphs called frequency polygons, which are another way to graph a frequency table. In a frequency polygon, the line moves from point to point. The height of each point shows the number of scores with that value. This creates a mountain peak skyline.
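As a quick illustration of the strict sense of these terms, here is a short Python sketch (our own example, not a standard procedure). It finds the value or values with the highest frequency in the stress ratings frequency table, using the frequencies shown with Figure 1–3a. A single result fits a unimodal distribution; two or more exactly tied results would be bimodal or multimodal in the strict sense.

```python
# Frequencies for stress ratings 0 through 10
stress_frequencies = dict(enumerate([1, 1, 1, 2, 1, 2, 4, 7, 5, 3, 3]))

highest_frequency = max(stress_frequencies.values())

# Every value whose frequency ties for the highest
modal_values = [value
                for value, frequency in stress_frequencies.items()
                if frequency == highest_frequency]

print("Most frequent value(s):", modal_values)
print("Frequency:", highest_frequency)
```

This check only catches exactly equal peaks. Judging whether a shape is approximately bimodal, in the informal sense psychologists use, still means looking at the graph.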


rectangular distribution frequency distribution in which all values have approximately the same frequency.

unimodal distribution frequency distribution with one value clearly having a larger frequency than any other.

bimodal distribution frequency distribution with two approximately equal frequencies, each clearly larger than any of the others.

multimodal distribution frequency distribution with two or more high frequencies separated by a lower frequency; a bimodal distribution is the special case of two high frequencies.

frequency distribution pattern of frequencies over the various values; what a frequency table, histogram, or frequency polygon describes.

Answers

1. Researchers make frequency graphs to show the pattern visually in a frequency table.

2. (a) The values, from lowest to highest, go along the bottom; (b) the frequencies from 0 at the bottom to the highest frequency of any value at the top go along the left; (c) above each value is a bar with a height of the frequency for that value.

3. See Figure 1–6.

Figure 1–6 Histogram for “How Are You Doing?” question 3.

[Figure 1–6: histogram. Horizontal axis: Value. Vertical axis: Frequency (0 to 8).]


Figure 1–7 Examples of (a) unimodal, (b) approximately bimodal, and (c) approximately rectangular frequency polygons.


[Figure 1–8, panel (a): horizontal axis, Age; vertical axis, Number of People in a Toddler’s Play Area. Panel (b): horizontal axis, Grade Level (1 to 6); vertical axis, Number of Students.]

Figure 1–8 Fictional examples of distributions that are not unimodal: (a) A bimodal distribution showing the possible frequencies for people of different ages in a toddler’s play area. (b) A rectangular distribution showing the possible frequencies of students at different grade levels in an elementary school.

The scores from most psychology studies are usually an approximately unimodal distribution. Bimodal and other multimodal distributions occasionally turn up. A bimodal example is the distribution of the ages of people in a toddler’s play area in a park, who are mostly either toddlers with ages of around 2 to 4 or caretakers with ages of 20 to 40 or so (with few people aged 5 to 19 years or above 40). Thus, if you make a frequency distribution of these ages, the large frequencies are at the values for low ages (2 to 4) and for higher ages (20 to 40 or so). An example of a rectangular distribution is the number of children at each grade level at an elementary school; there is about the same number in first grade, second grade, and so on. Figure 1–8 shows these examples.

Symmetrical and Skewed Distributions

Look again at the histograms of the stress ratings example (Figure 1–3). The distribution is lopsided, with more scores near the high end. This is somewhat unusual. Most things we measure in psychology have about equal numbers on both sides of the middle. That is, most of the time in psychology, the scores follow an approximately symmetrical distribution (if you fold the graph of a symmetrical distribution in half, the two halves look the same).

A distribution that clearly is not symmetrical is called a skewed distribution. The stress ratings distribution is an example. A skewed distribution has one side that is long and spread out, somewhat like a tail. The side with the fewer scores (the side that looks like a tail) is considered the direction of the skew. Thus, the stress study example, which has too few scores at the low end, is skewed to the left. However, the social interactions example, which has too few scores at the high end, is skewed to the right (see Figure 1–4). Figure 1–9 shows examples of approximately symmetrical and skewed distributions.

A distribution that is skewed to the right is also called positively skewed. A distribution skewed to the left is also called negatively skewed.
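The words “positively” and “negatively” line up with the sign of a common numerical index of skew. The Python sketch below is our own illustration and goes beyond what this chapter covers: it computes one standard skewness measure (the average cubed distance from the mean, divided by the cube of the standard deviation) for the two examples. The stress frequencies are those shown with Figure 1–3a. For the social interactions example we use the interval midpoints and frequencies from Table 1–7, so that result is only an approximation.

```python
def skewness(frequency_table):
    """Skewness index computed from a {value: frequency} table.

    Illustrative helper: negative results mean skewed to the left,
    positive results mean skewed to the right.
    """
    n = sum(frequency_table.values())
    mean = sum(value * f for value, f in frequency_table.items()) / n
    # Average squared and cubed distances from the mean
    m2 = sum(f * (value - mean) ** 2 for value, f in frequency_table.items()) / n
    m3 = sum(f * (value - mean) ** 3 for value, f in frequency_table.items()) / n
    return m3 / m2 ** 1.5

# Stress ratings 0 through 10 and their frequencies
stress = dict(enumerate([1, 1, 1, 2, 1, 2, 4, 7, 5, 3, 3]))

# Social interactions: interval midpoints and frequencies (approximation)
social_grouped = dict(zip(
    [2.5, 7.5, 12.5, 17.5, 22.5, 27.5, 32.5, 37.5, 42.5, 47.5],
    [12, 16, 16, 16, 10, 11, 4, 3, 3, 3],
))

print("Stress ratings skewness:", round(skewness(stress), 2))
print("Social interactions skewness:", round(skewness(social_grouped), 2))
```

The stress ratings result comes out below zero (skewed to the left) and the social interactions result comes out above zero (skewed to the right), matching the descriptions of the two examples.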

Strongly skewed distributions come up in psychology research mainly when what is being measured has some upper or lower limit. For example, a family cannot have fewer than zero children. When many scores pile up at the low end because it is impossible to have a lower score, the result is called a floor effect. A skewed distribution caused by a lower limit is shown in Figure 1–10a.


symmetrical distribution distribution in which the pattern of frequencies on the left and right side are mirror images of each other.

skewed distribution distribution in which the scores pile up on one side of the middle and are spread out on the other side; distribution that is not symmetrical.

TIP FOR SUCCESS It helps you remember the direction of the skew to know that the word skew comes from the French queue, which means line or tail. Thus, the direction of the skew is the side that has the long line, or tail.

Figure 1–9 Examples of frequency polygons of distributions that are (a) approximately symmetrical, (b) skewed to the right (positively skewed), and (c) skewed to the left (negatively skewed).

[Figure 1–10, panel (a): horizontal axis, Number of Children (0 to 6); vertical axis, Number of Families. Panel (b): horizontal axis, Percentage of Correct Answers (0 to 100); vertical axis, Participants.]

Figure 1–10 (a) A distribution skewed to the right due to a floor effect: fictional distribution of the number of children in families. (b) A distribution skewed to the left due to a ceiling effect: fictional distribution of adults’ scores on a multiplication table test.

floor effect situation in which many scores pile up at the low end of a distribution (creating skewness) because it is not possible to have any lower score.


ceiling effect situation in which many scores pile up at the high end of a distribution (creating skewness) because it is not possible to have a higher score.

normal curve specific, mathematically defined, bell-shaped frequency distribution that is symmetrical and unimodal; distributions observed in nature and in research commonly approximate it.

A skewed distribution caused by an upper limit is shown in Figure 1–10b. This is a distribution of adults’ scores on a multiplication table test. This distribution is strongly skewed to the left. Most of the scores pile up at the right, the high end (a perfect score). This shows a ceiling effect. The stress ratings example also shows a mild ceiling effect because many students had high levels of stress, the maximum rating was 10, and people often do not like to use ratings right at the maximum.

Normal and Kurtotic Distributions

Psychologists also describe a distribution in terms of whether the middle of the distribution is particularly peaked or flat. The standard of comparison is a bell-shaped curve. In psychology research and in nature generally, distributions often are similar to this bell-shaped standard, called the normal curve. We discuss this curve in some detail in later chapters. For now, however, the important thing is that the normal curve is a unimodal, symmetrical curve with an average peak, the sort of bell shape shown in Figure 1–11a. Both the stress ratings and the social interactions examples approximate a normal curve in a very general way, although, as we noted, both are somewhat skewed. In our experience, most distributions that result from psychology research are closer to the normal curve than are these two examples.

Kurtosis is how much the shape of a distribution differs from a normal curve in terms of whether its curve in the middle is more peaked or flat than the normal curve (DeCarlo, 1997). Kurtosis comes from the Greek word kyrtos, “curve.” Figure 1–11b shows a kurtotic distribution with a more extreme peak than the normal curve. Figure 1–11c shows an extreme example of a kurtotic distribution, one with a very flat distribution. (A rectangular distribution would be even more extreme.)

Distributions that are more peaked or flat than a normal curve also tend to have a different shape in the tails. Those with a very peaked curve usually have more scores in the tails of the distribution than the normal curve (see Figure 1–11b). It is as if the normal curve got pinched in the middle and some of it went up into a sharp peak and the rest spread out into thick tails. Distributions with a flatter curve usually have fewer scores in the tails of the distribution than the normal curve (see Figure 1–11c). It is as if the tails and the top of the curve both got sucked in toward the middle on both sides. Although it is often easiest to identify kurtosis in terms of how peaked or flat the distribution is, the number of scores in the tails is what matters.

kurtosis extent to which a frequency distribution deviates from a normal curve in terms of whether its curve in the middle is more peaked or flat than the normal curve.


Figure 1–11 Examples of (a) normal, (b) peaked, and (c) flat distribution. The normal distri- bution is shown as a dashed line in (b) and (c).

Source: Adapted from DeCarlo, T. (1997). On the meaning and use of kurtosis. Psychological Methods, 3, 292–307, Figure 1. Published by the American Psychological Association. Adapted with permission.


How are you doing?

1. Describe the difference between a unimodal and multimodal distribution in terms of (a) a frequency graph and (b) a frequency table.
2. What does it mean to say that a distribution is skewed to the left?
3. What kind of skew is created by (a) a floor effect and (b) a ceiling effect?
4. When a distribution is described as being peaked or flat, what is it being compared to?

Answers

1. (a) A unimodal distribution has one main high point; a multimodal distribution has more than one main high point. (b) A unimodal distribution has one value with a higher frequency than all the other frequencies; a multimodal distribution has more than one value with large frequencies compared to the values around it.

2. Fewer scores have low values than have high values.

3. (a) A skew created by a floor effect is skewed to the right; (b) one created by a ceiling effect is skewed to the left.

4. The distribution is being compared to a normal curve.

Controversy: Misleading Graphs

The most serious controversy about frequency tables and histograms is not among psychologists, but among the general public. The misuse of these procedures by some public figures, advertisers, and the media seems to have created skepticism about the trustworthiness of statistics in general and of statistical tables and charts in particular. Everyone has heard that “statistics lie.”

Of course, people can and do lie with statistics. It is just as easy to lie with words, but you may be less sure of your ability to recognize lies with numbers. In this section, we note two ways in which frequency tables and graphs can be misused and tell how to recognize such misuses. (Much of this material is based on the classic discussion of these issues in Tufte, 1983.)

Failure to Use Equal Interval Sizes

A key requirement of a grouped frequency table or graph is that the size of the intervals be equal. If they are not equal, the table or graph can be very misleading. Tufte (1983) gives an example, shown in Figure 1–12, from the respectable (and usually accurate) New York Times. This chart gives the impression that commissions paid to travel agents dropped dramatically in 1978. However, a close reading of the graph shows that the third bar for each airline is for only the first half of 1978. Thus, only half a year is being compared to each of the preceding full years. Assuming that the second half of 1978 was like the first half, the information in this graph actually tells us that 1978 shows an increase rather than a decrease. For example, Delta Airlines estimated a full-year 1978 figure of $72 million, much higher than 1977’s $57 million.
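The arithmetic behind the Delta example can be laid out in a few lines of Python. This is a sketch for illustration only. It uses the two Delta figures given in the text and assumes, as the text does, that the two halves of 1978 were alike, so the first-half amount is taken to be half of the full-year estimate.

```python
delta_1977 = 57              # millions of dollars, full year 1977
delta_full_year_1978 = 72    # Delta's full-year 1978 estimate

# Assumption from the text: the second half of 1978 was like the first half,
# so the half-year bar in the chart would be half of the full-year estimate.
delta_first_half_1978 = delta_full_year_1978 / 2

# Comparing a half year with a full year (unequal intervals)
misleading_change = delta_first_half_1978 - delta_1977

# Comparing a full year with a full year (equal intervals)
fair_change = delta_full_year_1978 - delta_1977

print("Half year vs. full year:", misleading_change)   # looks like a drop
print("Full year vs. full year:", fair_change)         # actually a rise
```

The sign of the change flips once the two intervals being compared are the same size.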

[Figure 1–12: bar chart titled “Commission Payments to Travel Agents,” in millions of dollars. For each of four airlines (United Airlines, TWA, Eastern, and Delta) there are three bars: ’76, ’77, and First Half ’78.]

Figure 1–12 Misleading illustration of a frequency distribution due to unequal interval sizes.

Source: “Commission Payments to Travel Agents,” From The New York Times, August 8, 1978. © 1978 The New York Times. Used by permission and protected by the Copyright Laws of the United States. The printing, copying, redistribution, or retransmission of the Material without express written permission is prohibited. www.nytimes.com

Exaggeration of Proportions

The height of a histogram or bar graph (or frequency polygon) usually begins at 0 or the lowest value of the scale and continues to the highest value of the scale. Figure 1–13a shows a bar graph that does not follow this standard. The bar graph shows the mean housing price in a particular region over a 4-year period (from 2004 to 2007). By starting the vertical axis at $150,000 (instead of 0, as is customary), the graph appears to exaggerate the changes in housing price over time. Figure 1–13b shows the same results with the vertical axis starting at $0. You can still see the changes in housing price from year to year in Figure 1–13b, but the figure does a better job of showing the size of those changes.

[Figure 1–13, panel (a): horizontal axis, Year (2004 to 2007); vertical axis, Mean House Price ($), from 150,000 to 164,000. Panel (b): horizontal axis, Year (2004 to 2007); vertical axis, Mean House Price ($), from 0 to 175,000.]

Figure 1–13 Misleading bar graph due to not starting at zero. The vertical axis starts at $150,000 for figure (a) compared to $0 for figure (b).

The overall proportion of a histogram or bar graph should be about 1 to 1.5 times as wide as it is tall, as in Figure 1–14a for the stress ratings example. But look what happens if we make the graph much taller or shorter, as shown in Figures 1–14b and 1–14c. The effect is like that of a fun house mirror: the true picture is distorted. Any particular shape is in a sense accurate. But the 1-to-1.5 proportion has been adopted to give people a standard for comparison. Changing this proportion misleads the eye.

[Figure 1–14, panels (a), (b), and (c): the same stress ratings histogram drawn at three different proportions. Horizontal axis: Stress Rating, marked at the interval midpoints 1, 3, 5, 7, 9, and 11. Vertical axis: Frequency (0 to 10).]

Figure 1–14 Histogram of students’ stress ratings distorted from the standard of width 1 to 1.5 times height. (Data based on Aron et al., 1995.)

Frequency Tables and Histograms in Research Articles

Psychology researchers mainly use frequency tables and histograms as a first step in more elaborate statistical analyses. They are usually not included in research articles, and when they are, just because they are so rare, they are often not standard in some way. When they do appear, they are most likely to be in survey studies. For example, Raskauskas and Stoltz (2007) asked a group of 84 adolescents about their involvement in traditional and electronic bullying. The researchers defined electronic bullying as “. . . a means of bullying in which peers use electronics [such as text messages, emails, and defaming Web sites] to taunt, threaten, harass, and/or intimidate a peer” (p. 565). Table 1–8 is a frequency table showing the adolescents’ reported incidence of being victims or perpetrators of traditional and electronic bullying. The table shows, for example, that about half (48.8%) of the adolescents reported being the victim of electronic bullying, and the most common vehicle for electronic bullying (experienced by 32.1% of the adolescents) was text messaging.

Histograms are even more rare in research articles (except in articles about statistics), but they do appear occasionally. Maggi and colleagues (2007) conducted a study of age-related changes in cigarette smoking behaviors in Canadian adolescents. As shown in Figure 1–15, they created a histogram (from a grouped frequency table) to display their results. Their histogram shows the results from the two samples they studied (one shown in the light colored bars and the other in the dark colored bars). As you can see in the figure, less than 10% of the 10- and 11-year-olds reported that they had tried smoking, but more than half of the 16- and 17-year-olds said they had tried smoking. As already mentioned, such figures are often not standard in some way. In this example, the researchers drew the histogram with gaps between the bars, whereas it is standard not to use gaps (unless you are drawing a bar graph for a nominal variable). However, the histogram still does a good job of showing the distribution. Also, the researchers, to allow for a fair comparison of how the rate of smoking differed among adolescents of varying ages, plotted the percentage of adolescents on the vertical axis instead of the actual number of adolescents. (Plotting the actual number of adolescents who reported smoking would have been misleading, because there were not the same number of individuals in each of the age groups.)

Table 1–8 Incidence of Traditional and Electronic Bullying and Victimization (N = 84)

Form of bullying                              N      %
Electronic victims                           41   48.8
  Text-message victim                        27   32.1
  Internet victim (Web sites, chatrooms)     13   15.5
  Picture-phone victim                        8    9.5
Traditional victims                          60   71.4
  Physical victim                            38   45.2
  Teasing victim                             50   59.5
  Rumors victim                              32   38.6
  Exclusion victim                           30   50.0
Electronic bullies                           18   21.4
  Text-message bully                         18   21.4
  Internet bully                             11   13.1
Traditional bullies                          54   64.3
  Physical bully                             29   34.5
  Teasing bully                              38   45.2
  Rumor bully                                22   26.2
  Exclusion bully                            35   41.7

Source: Raskauskas, J., & Stoltz, A. D. (2007). Involvement in traditional and electronic bullying among adolescents. Developmental Psychology, 43, 564–575. Published by the American Psychological Association. Reprinted with permission.

[Figure 1–15: histogram with two sets of bars (light and dark). Horizontal axis: Age in Years (10–11, 12–13, 14–15, 16–17). Vertical axis: Adolescents Who Tried Smoking (%), from 0 to 100.]

Figure 1–15 Change in the percentage of adolescents surveyed in the Canadian National Longitudinal Survey of Children and Youth longitudinal sample.

Source: Maggi, S., Hertzman, C., & Vaillancourt, T. (2007). Changes in smoking behaviors from late childhood to adolescence: Insights from the Canadian National Longitudinal Survey of Children and Youth. Health Psychology, 26, 232–240. Published by the American Psychological Association. Reprinted with permission.

Summary

1. Psychologists use descriptive statistics to describe and summarize a group of numbers from a research study.

2. A value is a number or category; a variable is a characteristic that can have different values; a score is a particular person’s value on the variable.

3. Most variables in psychology research are numeric with approximately equal intervals. However, some numeric variables are rank-ordered (the values are ranks), and some variables are not numeric at all (the values are categories).

4. A frequency table organizes the scores into a table of each of the possible values with the frequency and percentage of scores with that value.

5. When there are many different values, a grouped frequency table is useful. It is like an ordinary frequency table except that the frequencies are given for intervals that include a range of values.

6. The pattern of frequencies in a distribution can be shown visually with a histogram (or bar graph), in which the height of each bar is the frequency for a particular value.

7. The general shape of a histogram can be unimodal (having a single peak), bimodal (having two peaks), multimodal (including bimodal), or rectangular (having no peak); it can be symmetrical or skewed (having a long tail) to the right or the left; and, compared to the bell-shaped normal curve, it can be kurtotic (having a peaked or flat distribution).

8. Statistical graphs for the general public are sometimes distorted in ways that mislead the eye, such as failing to use equal intervals or exaggerating proportions.

9. Frequency tables and histograms are rarely shown in research articles. When they are, they often follow nonstandard formats or involve frequencies (or percentages) for a nominal variable. The shapes of distributions are more often described.

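[Figure 1–16: worked frequency table for the scores 2, 4, 5, 5, 1, 3, 6, 3, 6, 6. Circled numbers ❶ to ❹ mark the steps described in the caption.]

Interest in Graduate School   Frequency   Percent
            1                     1          10
            2                     1          10
            3                     2          20
            4                     1          10
            5                     2          20
            6                     3          30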

Figure 1–16 Answer to Example Worked-Out Problem for making a frequency table. ❶ Make a list down the page of each possible value, from lowest to highest. ❷ Go one by one through the scores, making a mark for each next to its value on your list. ❸ Make a table showing how many times each value on your list is used. ❹ Figure the percentage of scores for each value.

Key Terms

statistics (p. 2); descriptive statistics (p. 2); inferential statistics (p. 2); variable (p. 3); values (p. 3); score (p. 3); numeric variable (p. 4); equal-interval variable (p. 4); ratio scale (p. 4); discrete variable (p. 4); continuous variable (p. 4); rank-order variable (p. 4); nominal variable (p. 4); levels of measurement (p. 5); frequency table (p. 7); interval (p. 9); grouped frequency table (p. 9); histogram (p. 10); frequency distribution (p. 15); unimodal distribution (p. 15); bimodal distribution (p. 15); multimodal distribution (p. 15); rectangular distribution (p. 15); symmetrical distribution (p. 17); skewed distribution (p. 17); floor effect (p. 17); ceiling effect (p. 18); normal curve (p. 18); kurtosis (p. 18)

Example Worked-Out Problems

Ten first-year students rated their interest in graduate school on a scale from 1 = no interest at all to 6 = high interest. Their scores were as follows: 2, 4, 5, 5, 1, 3, 6, 3, 6, 6.

Making a Frequency Table

See Figure 1–16.
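If you would like to check your hand-made table, the same steps can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the method taught in this chapter; the variable names are our own, and the scores are the ten ratings given above.

```python
from collections import Counter

# The ten interest-in-graduate-school ratings from the worked-out problem.
scores = [2, 4, 5, 5, 1, 3, 6, 3, 6, 6]

# Step 1: list each possible value, from lowest to highest (the scale is 1 to 6).
possible_values = range(1, 7)

# Step 2: go through the scores, tallying each one next to its value.
counts = Counter(scores)

# Steps 3 and 4: record the frequency and the percentage for each value.
table = []
for value in possible_values:
    frequency = counts[value]
    percent = 100 * frequency / len(scores)
    table.append((value, frequency, percent))

print(f"{'Value':>5} {'Frequency':>9} {'Percent':>7}")
for value, frequency, percent in table:
    print(f"{value:>5} {frequency:>9} {percent:>7.0f}")
```

Listing every possible value (rather than only the values that occur) keeps any value with a frequency of 0 in the table.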

Making a Histogram

See Figure 1–17.

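[Figure 1–17: frequency table for Interest in Graduate School (values 1 through 6 with frequencies 1, 1, 2, 1, 2, 3) and the resulting histogram, with the values 1 to 6 along the bottom and frequencies from 0 to 3 along the left edge. Circled numbers ❶ to ❹ mark the steps described in the caption.]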

Figure 1–17 Answer to Worked-Out Problem for making a histogram. ❶ Make a frequency table (or grouped frequency table). ❷ Put the values along the bottom of the page, from left to right, from lowest to highest. ❸ Make a scale of frequencies along the left edge of the page that goes from 0 at the bottom to the highest frequency for any value. ❹ Make a bar above each value with a height for the frequency of that value.
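The steps in Figure 1–17 can also be sketched as a rough text histogram in Python. This is a minimal illustration under our own assumptions (the function name histogram_rows and the use of # characters for bars are ours); it is not how SPSS or any other package draws a histogram.

```python
from collections import Counter

# Step 1: make a frequency table for the worked-out problem's scores.
scores = [2, 4, 5, 5, 1, 3, 6, 3, 6, 6]
values = list(range(1, 7))  # possible values, lowest to highest
counts = Counter(scores)
frequencies = [counts[v] for v in values]


def histogram_rows(values, frequencies):
    """Return the lines of a text histogram, top row first."""
    rows = []
    # Step 3: frequency scale on the left, from the highest frequency down to 1.
    for level in range(max(frequencies), 0, -1):
        # Step 4: a bar reaches this row if its frequency is at least this level.
        cells = [" # " if f >= level else "   " for f in frequencies]
        rows.append(f"{level:>2} |" + "".join(cells))
    rows.append(" 0 +" + "---" * len(values))
    # Step 2: the values along the bottom, from left to right.
    rows.append("    " + "".join(f" {v} " for v in values))
    return rows


print("\n".join(histogram_rows(values, frequencies)))
```

Each column of # characters is one bar, and its height is the frequency of the value printed beneath it.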

Practice Problems

These problems involve tabulation and making graphs. Most real-life statistics problems are done on a computer with special statistical software. Even if you have such software, do these problems by hand to ingrain the method in your mind. To learn how to use a computer to solve statistics problems like those in this chapter, refer to the Using SPSS section at the end of this chapter and the Study Guide and Computer Workbook that accompanies this text.

All data are fictional unless an actual citation is given.

Set I (for Answers to Set I Problems, see pp. 673–674)

1. A client rates her satisfaction with her vocational counselor as a 3 on a 4-point scale from 1 = not at all satisfied to 4 = very satisfied. What is the (a) variable, (b) possible values, and (c) this client’s score?

2. Give the level of measurement for each of the following variables: (a) ethnic group to which a person belongs, (b) number of times a mouse makes a wrong turn in a laboratory maze, and (c) position one finishes in a race.

3. A particular block in a suburban neighborhood has 20 households. The number of children in these households is as follows:

2, 4, 2, 1, 0, 3, 6, 0, 1, 1, 2, 3, 2, 0, 1, 2, 1, 0, 2, 2

Make (a) a frequency table and (b) a histogram. Then (c) describe the general shape of the distribution.

4. Fifty students were asked how many hours they studied this weekend. Here are their answers:

11, 2, 0, 13, 5, 7, 1, 8, 12, 11, 7, 8, 9, 10, 7, 4, 6, 10, 4, 7, 8, 6, 7, 10, 7, 3, 11, 18, 2, 9, 7, 3, 8, 7, 3, 13, 9, 8, 7, 7, 10, 4, 15, 3, 5, 6, 9, 7, 10, 6

Make (a) a frequency table and (b) a histogram. Then (c) describe the general shape of the distribution.

5. These are the scores on a test of sensitivity to smell taken by 25 chefs attending a national conference:

96, 83, 59, 64, 73, 74, 80, 68, 87, 67, 64, 92, 76, 71, 68, 50, 85, 75, 81, 70, 76, 91, 69, 83, 75

Make (a) a frequency table and (b) a histogram. (c) Make a grouped frequency table using intervals of 50–59, 60–69, 70–79, 80–89, and 90–99. Based on the grouped frequency table, (d) make a histogram and (e) describe the general shape of the distribution.

6. The following data are the number of minutes it took each of a group of 34 10-year-olds to do a series of abstract puzzles:

24, 83, 36, 22, 81, 39, 60, 62, 38, 66, 38, 36, 45, 20, 20, 67, 41, 87, 41, 82, 35, 82, 28, 80, 80, 68, 40, 27, 43, 80, 31, 89, 83, 24

Make (a) a frequency table and (b) a grouped frequency table using intervals of 20–29, 30–39, 40–49, 50–59, 60–69, 70–79, and 80–89. Based on the grouped frequency table, (c) make a histogram and (d) describe the general shape of the distribution.

7. Describe the shapes of the three distributions illustrated.

[Figure: three distributions, labeled (a), (b), and (c).]

8. Draw an example of each of the following distributions: (a) symmetrical, (b) rectangular, and (c) skewed to the right.

9. Explain to a person who has never had a course in statistics what is meant by (a) a symmetrical unimodal distribution and (b) a negatively skewed unimodal distribution. (Be sure to include in your first answer an explanation of what “distribution” means.)

10. McKee and Ptacek (2001) asked 90 college students about a time they had delivered bad news to someone. Table 1–9 shows the results for the type of bad news given. (a) Using this table as an example, explain the idea of a frequency table to a person who has never had a course in statistics. (b) Explain the general meaning of the pattern of results.

Set II

11. A participant in a cognitive psychology study is given 50 words to remember and later asked to recall as many as he can of them. This participant recalls 17. What is the (a) variable, (b) possible values, and (c) score?

12. Explain and give an example for each of the following types of variables: (a) equal-interval, (b) rank-order, (c) nominal, (d) ratio scale, (e) continuous.

13. An organizational psychologist asks 20 employees in a company to rate their job satisfaction on a 5-point scale from 1 = very unsatisfied to 5 = very satisfied. The ratings are as follows:

3, 2, 3, 4, 1, 3, 3, 4, 5, 2, 3, 5, 2, 3, 3, 4, 1, 3, 2, 4

Make (a) a frequency table and (b) a histogram. Then (c) describe the general shape of the distribution.

14. A social psychologist asked 15 college students how many times they “fell in love” before they were 11 years old. The numbers of times were as follows:

2, 0, 6, 0, 3, 1, 0, 4, 9, 0, 5, 6, 1, 0, 2

15. Following are the speeds of 40 cars clocked by radar on a particular road in a 35-mph zone on a particular afternoon:

30, 36, 42, 36, 30, 52, 36, 34, 36, 33, 30, 32, 35, 32, 37, 34, 36, 31, 35, 20, 24, 46, 23, 31, 32, 45, 34, 37, 28, 40, 34, 38, 40, 52, 31, 33, 15, 27, 36, 40

16. Here are the number of holiday gifts purchased by 25 families randomly interviewed at a local mall at the end of the holiday season:

22, 18, 22, 26, 19, 14, 23, 27, 2, 18, 28, 28, 11, 16, 34, 28, 13, 21, 32, 17, 6, 29, 23, 22, 19

Make (a) a frequency table and (b) a grouped frequency table using intervals of 0–4, 5–9, 10–14, 15–19, 20–24, 25–29, and 30–34. Based on the grouped frequency table, (c) make a histogram and (d) describe the general shape of the distribution.

Table 1–9 Descriptive Statistics for the Type of News Given

Category                                                      Frequency   Percentage
1. Relationship with family                                       19         21.1
2. School                                                          1          1.1
3. Job/work                                                        6          6.7
4. Relationship with actual/potential girlfriend/boyfriend        17         18.9
5. Personal health                                                 1          1.1
6. Finance                                                         1          1.1
7. Relationship with friends                                      21         23.3
8. Health of family member/friend                                 23         25.6
9. Other                                                           1          1.1

Source: McKee, T. L. E., & Ptacek, J. T. (2001). I’m afraid I have something bad to tell you: Breaking bad news from the perspective of the giver. Journal of Applied Social Psychology, 31, 246–273. Copyright © 2001 by Blackwell Publishing. Reprinted by permission of Blackwell Publishers Journals.

17. Pick a book and a page number of your choice. (Select a page with at least 30 lines; do not pick a textbook or any book with tables or illustrations.) Make a list of the number of words on each line; use that list as your data set. Make (a) a frequency table and (b) a histogram. Then (c) describe the general shape of the distribution. (Be sure to give the name, author, publisher, and year of the book you used, along with the page number, with your answer.)

18. Explain to a person who has never taken a course in statistics the meaning of a grouped frequency table.

19. Give an example of something having these distribution shapes: (a) bimodal, (b) approximately rectangular, and (c) positively skewed. Do not use an example given in this book or in class.

20. Find an example in a newspaper or magazine of a graph that misleads by failing to use equal interval sizes or by exaggerating proportions.

21. Nownes (2000) surveyed representatives of interest groups who were registered as lobbyists of three U.S. state legislatures. One of the issues he studied was whether interest groups are in competition with each other. Table 1–10 shows the results for one such question. (a) Using this table as an example, explain the idea of a frequency table to a person who has never had a course in statistics. (b) Explain the general meaning of the pattern of results.

22. Mouradian (2001) surveyed college students selected from a screening session to include two groups: (a) “Perpetrators,” students who reported at least one violent act (hitting, shoving, etc.) against their partner in their current or most recent relationship, and (b) “Comparisons,” students who did not report any such uses of violence in any of their last three relationships. At the actual testing session, the students first read a description of an aggressive behavior such as, “Throw something at his or her partner” or “Say something to upset his or her partner.” They then were asked to write “as many examples of circumstances of situations as [they could] in which a person might engage in behaviors or acts of this sort with or towards their significant other.” Table 1–11 shows the “Dominant Category of Explanation” (the category a participant used most) for females and males, broken down by comparisons and perpetrators. (a) Using this table as an example, explain the idea of a frequency table to a person who has never had a course in statistics. (b) Explain the general meaning of the pattern of results.

Table 1–10 Competition for Members and Other Resources

Question: How much competition does this group face from other groups with similar goals for members and other resources?

Answer                  Percentage   Number
No competition              20         118
Some competition            58         342
A lot of competition        22         131
Total                      100         591

Note: There were no statistically significant differences between states. For full results of significance tests, contact the author.
Source: Nownes, A. J. (2001). Policy conflict and the structure of interest communities. American Politics Quarterly, 28, 316. Copyright © 2001 by Sage Publications, Ltd. Reprinted by permission of Sage Publications, Thousand Oaks, London, and New Delhi.

Table 1–11 Dominant Category of Explanation for Intimate Aggression by Gender and Perpetrator Status

                                                Female                          Male
                                    Comparisons    Perpetrators    Comparisons    Perpetrators
                                      (n = 36)       (n = 33)        (n = 32)       (n = 25)
Category                                f      %       f      %        f      %       f      %
Self-defense                            2      6       3      9        3      9       1      4
Control motives                         8     22       9     27        9     28       3     12
Expressive aggression                   4     11       3      9        3      9       8     32
Face/self-esteem preservation           1      3       2      6        2      6       3     12
Exculpatory explanations                5     14       3      9        3      9       3     12
Rejection of perpetrator or act        12     33       6     18       10     31       7     28
Prosocial/acceptable explanations       0      0       0      0        0      0       0      0
Tied categories                         4     11       7     21        2      6       0      0

Note: f = frequency. % = percentage of respondents in a given group who provided a particular category of explanation.
Source: Mouradian, V. E. (2001). Applying schema theory to intimate aggression: Individual and gender differences in representation of contexts and goals. Journal of Applied Social Psychology, 31, 376–408. Copyright © 2001 by Blackwell Publishing. Reprinted by permission of Blackwell Publishers Journals.

Using SPSS

The U in the following steps indicates a mouse click. (We used SPSS version 15.0 to carry out these analyses. The steps and output may be slightly different for other versions of SPSS.)

Creating a Frequency Table

❶ Enter the scores from your distribution in one column of the data window.
❷ U Analyze.
❸ U Descriptive statistics.
❹ U Frequencies.
❺ U the variable you want to make a frequency table of and then U the arrow.
❻ U OK.

Practice the preceding steps by creating a frequency table for the social interactions example in this chapter (the scores are listed on p. 8). After Step ❺, your screen should look like Figure 1–18. Your output window (which appears after you U on OK in Step ❻) should look like Figure 1–19. As you will see, SPSS automatically produces a column with the cumulative percentage (or percentile) for each value. (Note that it is possible to create grouped frequency tables in SPSS, but since it is not a straightforward process, we do not cover it here.)
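To make the idea of a cumulative percentage concrete, here is a minimal sketch in Python. It is an illustration only, not SPSS output and not how SPSS computes its table. Because the social interactions scores are not repeated here, the sketch assumes the ten interest-in-graduate-school ratings from the Example Worked-Out Problems instead.

```python
from collections import Counter
from itertools import accumulate

# Assumed data: the ratings from the Example Worked-Out Problems,
# used here in place of the social interactions scores.
scores = [2, 4, 5, 5, 1, 3, 6, 3, 6, 6]

counts = Counter(scores)
values = sorted(counts)                      # values that occur, lowest to highest
frequencies = [counts[v] for v in values]
running_totals = list(accumulate(frequencies))
n = len(scores)

# Each row: value, frequency, percent, cumulative percent.
rows = [
    (v, f, 100 * f / n, 100 * total / n)
    for v, f, total in zip(values, frequencies, running_totals)
]

print(f"{'Value':>5} {'Frequency':>9} {'Percent':>7} {'Cumulative %':>12}")
for value, frequency, percent, cumulative in rows:
    print(f"{value:>5} {frequency:>9} {percent:>7.1f} {cumulative:>12.1f}")
```

The cumulative percentage for a value is the percentage of scores at or below that value, so the last row always reaches 100.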

Creating a Histogram

❶ Enter the scores from your distribution in one column of the data window.
❷ U Analyze.
❸ U Descriptive statistics.
❹ U Frequencies.
❺ U the variable you want to make a histogram of and then U on the arrow.
❻ U Charts, U Histograms, U Continue.
❼ Optional: To instruct SPSS not to produce a frequency table, U the box labeled Display frequency tables (this unchecks the box).
❽ U OK.

Figure 1–18 SPSS data window and frequencies window for the social interactions example. (Data from McLaughlin-Volpe et al., 2001.)

Figure 1–19 SPSS frequency table for the social interactions example. (Data from McLaughlin-Volpe et al., 2001.)

Practice these steps by creating a histogram for the social interactions example in this chapter (the scores are listed on p. 8). Your output window should look like Figure 1–20. Notice that SPSS automatically creates a histogram based on a grouped frequency table, with an interval in this case of 3 (1–3, 4–6, 7–9, and so on). (Should you wish, you can change the number of intervals or the interval size for the histogram by doing the following: Place your mouse cursor on the histogram and double U to bring up a Chart Editor window; place your mouse cursor over one of the bars in the histogram and double U to bring up a Properties window; U the tab labeled Binning; U Custom; then enter the number of intervals you want for the interval size, labeled Interval Width; U Apply.) (If you want a nongrouped histogram, type in “1” for the interval size.)
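The grouping behind such a histogram (intervals of 1–3, 4–6, 7–9, and so on) can be sketched in Python as follows. This is a rough illustration under stated assumptions: the function grouped_frequencies is our own, the scores are hypothetical whole numbers invented for the example, and the code does not reproduce how SPSS chooses or draws its intervals.

```python
from collections import Counter


def grouped_frequencies(scores, start, width):
    """Count whole-number scores in equal intervals beginning at `start`.

    Assumes every score is an integer greater than or equal to `start`.
    """
    # (score - start) // width gives the position of the score's interval.
    counts = Counter((score - start) // width for score in scores)
    table = []
    for position in range(max(counts) + 1):
        low = start + position * width
        high = low + width - 1
        table.append((f"{low}-{high}", counts[position]))
    return table


# Hypothetical scores, invented for illustration only.
scores = [1, 2, 2, 4, 5, 5, 6, 7, 9, 12]

for interval, frequency in grouped_frequencies(scores, start=1, width=3):
    print(f"{interval:>5}: {'#' * frequency}")
```

Setting width to 1 gives one interval per value, which corresponds to the nongrouped histogram mentioned above.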