## Lesson 3: Descriptive Statistics and Graphs

#### Objectives

- Compute descriptive statistics.
- Compare means for different groups.
- Display frequency distributions and histograms.
- Display boxplots.

#### Overview

In this lesson, you will learn how to produce various descriptive statistics, simple frequency distribution tables, and frequency histograms. You will also learn how to explore your data and create boxplots.

#### Example

Let us return to our example of 20 students and five quizzes. We would like to calculate the average score (mean) and standard deviation for each quiz. We will also look at the mean scores for men and women on each quiz. Open the SPSS data file you saved in Lesson 2, or use the lesson_3.sav data file. Remember that we previously calculated the average quiz score for each person and included that as a new variable in our data file.

To calculate the means and standard deviations for age, all quizzes, and the average quiz score, select **Analyze**, then **Descriptive Statistics**, and then **Descriptives** (see Figure 3-1).

Figure 3-1 Accessing the Descriptives Procedure

Move the desired variables into the variables window (see Figure 3-2) and then click **Options**.

Figure 3-2 Move the desired variables into the variables window.

In the resulting dialog box (see Figure 3-3), make sure you check (at a minimum) the boxes in front of Mean and Std. deviation. Then click **Continue** and **OK**.

Figure 3-3 Descriptives options

The resulting output table showing the means and standard deviations of the variables is opened in the SPSS Viewer (see Figure 3-4).

Figure 3-4 Output from Descriptives Procedure
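For readers who want to check the Descriptives output by hand, the following is a minimal Python sketch of the same calculation. The scores in `quiz1` and the helper name `describe` are hypothetical and are not taken from the lesson's data file. The sketch uses the sample standard deviation (n - 1 in the denominator), which is the definition SPSS reports as Std. deviation.

```python
import statistics

# Hypothetical scores for a single quiz (NOT the lesson's actual data).
quiz1 = [83, 85, 99, 91, 77, 86, 92, 90, 89, 98]


def describe(values):
    """Return N, the mean, and the sample standard deviation (n - 1)."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "std_dev": statistics.stdev(values),  # sample SD, not population SD
    }


summary = describe(quiz1)
print(f"N = {summary['n']}, Mean = {summary['mean']:.2f}, "
      f"Std. Deviation = {summary['std_dev']:.3f}")
```

Replacing `statistics.stdev` with `statistics.pstdev` would give the population standard deviation instead, which is slightly smaller for the same data.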

#### Exploring Means for Different Groups

When you have two or more groups, you may want to examine the means for each group as well as the overall mean. The SPSS Compare Means procedure provides this functionality and much more, including various hypothesis tests. Assume that you want to compare the means of men and women on age, the five quizzes, and the average quiz score. Select **Analyze**, **Compare Means**, **Means** (see Figure 3-5):

Figure 3-5 Selecting Means Procedure

In the resulting dialog box, move the variables you are interested in summarizing into the **Dependent List**. At this point, do not worry whether your variables are actual "dependent variables" or not. Move Sex to the **Independent List** (see Figure 3-6). Click **Options** to see the many summary statistics available. In the current case, make sure that Mean, Number of Cases, and Standard Deviation are selected, and then click **Continue**.

Figure 3-6 Means dialog box

When you click **OK**, the report table appears in the SPSS Viewer with the separate means for the two sexes along with the overall data, as shown in the following figure.

Figure 3-7 Report from Means procedure
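The layout of the Means report (one row per group plus a total row) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `records` list and the function name `means_by_group` are hypothetical, and the values are invented rather than drawn from the lesson's data.

```python
import statistics
from collections import defaultdict

# Hypothetical (sex, average quiz score) pairs; NOT the lesson's data.
records = [
    ("Male", 80), ("Female", 90), ("Male", 70),
    ("Female", 94), ("Male", 90), ("Female", 86),
]


def means_by_group(rows):
    """Return {group: (N, mean, sample SD)} plus a 'Total' entry."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)

    report = {
        group: (len(vals), statistics.mean(vals), statistics.stdev(vals))
        for group, vals in groups.items()
    }
    # The total row summarizes all cases, ignoring group membership.
    everyone = [value for _, value in rows]
    report["Total"] = (
        len(everyone), statistics.mean(everyone), statistics.stdev(everyone)
    )
    return report


report = means_by_group(records)
for group, (n, mean, sd) in report.items():
    print(f"{group:<8} N={n}  Mean={mean:.2f}  Std. Deviation={sd:.3f}")
```

Note that the total row is computed from all cases together; it is not the average of the two group means unless the groups are the same size.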

As this lesson makes clear, there are several ways to produce summary statistics such as means and standard deviations in SPSS. From Lesson 2 you may recall that splitting the file would allow you to calculate the descriptive statistics separately for males and females. The way to find the procedure that works best in a given situation is to try different ones, and always to explore the options presented in the SPSS menus and dialog boxes. The extensive SPSS help files and tutorials are also very useful.

#### Frequency Distributions and Histograms

SPSS provides several different ways to explore, summarize, and present data in graphic form. For many procedures, graphs and plots are available as output options. SPSS also has an extensive interactive chart gallery and a chart builder that can be accessed through the **Graphs** menu. We will look at only a few of these features, and the interested reader is encouraged to explore the many additional charting and graphing features of SPSS.

One very useful feature of the Frequencies procedure in SPSS is that it can produce simple frequency tables and histograms. You may optionally choose to have the normal curve superimposed on the histogram for a visual check as to how the data are distributed. Let us examine the distribution of ages of our 20 hypothetical students. Select **Analyze**, **Descriptive Statistics**, **Frequencies** (see Figure 3-8).

Figure 3-8 Selecting Frequencies procedure

In the Frequencies dialog, move Age to the variables window, and then click **Charts**. Select **Histograms** and check the box in front of **With normal curve** (see Figure 3-9).

Figure 3-9 Frequencies: Charts dialog

Click **Continue** and **OK**. In the resulting output, SPSS displays the simple frequency table for age and the frequency histogram with the normal curve (see Figures 3-10 and 3-11).

Figure 3-10 Simple frequency table

Figure 3-11 Frequency histogram with normal curve
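To see what goes into a simple frequency table, here is a rough Python sketch that counts each distinct value and derives the percent and cumulative percent columns. The `ages` list and the function name `frequency_table` are hypothetical, and the sketch assumes there are no missing values. A row of asterisks stands in for the histogram bars.

```python
from collections import Counter

# Hypothetical ages for ten students (NOT the lesson's data).
ages = [18, 19, 19, 20, 20, 20, 21, 21, 22, 24]


def frequency_table(values):
    """Return rows of (value, frequency, percent, cumulative percent)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    cumulative = 0
    for value in sorted(counts):
        cumulative += counts[value]
        rows.append((
            value,
            counts[value],
            100 * counts[value] / total,
            100 * cumulative / total,
        ))
    return rows


table = frequency_table(ages)
print(f"{'Age':>5} {'Frequency':>9} {'Percent':>7} {'Cum. Pct':>8}")
for value, freq, pct, cum_pct in table:
    # The asterisks give a crude text histogram of the same counts.
    print(f"{value:>5} {freq:>9} {pct:>7.1f} {cum_pct:>8.1f}  {'*' * freq}")
```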

#### Exploratory Data Analysis

In addition to the standard descriptive statistics and frequency distributions and graphs, SPSS also provides many graphical and semi-graphical techniques collectively referred to as exploratory data analysis (EDA). EDA is useful for describing the characteristics of a dataset, identifying outliers, and providing summary descriptions. Some of the most widely used EDA techniques are boxplots and stem-and-leaf displays. You can access these techniques through the commands found through **Analyze**, **Descriptive Statistics**, **Explore**. As with the Compare Means procedure, groups can be separated if desired. For example, a side-by-side boxplot comparing the average quiz grades of men and women is shown in Figure 3-12.

Figure 3-12 Boxplots
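The numbers behind a boxplot can be sketched as follows. This is an illustration only: the `scores` list and the function name `box_stats` are hypothetical, and the sketch uses the common 1.5 × IQR rule for flagging outliers. Quartile conventions differ between programs, so the values computed here may not match the SPSS output exactly for every dataset.

```python
import statistics

# Hypothetical average quiz scores for one group (NOT the lesson's data).
scores = [40, 78, 80, 82, 84, 86, 88, 90, 92]


def box_stats(values):
    """Return the quantities a boxplot draws, using 1.5 * IQR fences."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in values if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),   # smallest value that is not an outlier
        "whisker_high": max(inside),  # largest value that is not an outlier
        "outliers": [v for v in values if v < lower_fence or v > upper_fence],
    }


stats = box_stats(scores)
for name, value in stats.items():
    print(f"{name}: {value}")
```

To mimic the side-by-side display, `box_stats` could be called once per group and the results compared.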