T-Test mtcars

Note: This article shows an R example on how to conduct a t-test on the dataset mtcars. For general information about t-tests please refer to the following article: T-Test.

In short: In this article a t-test is conducted on the dataset mtcars which examines if the mean fuel efficiency (mpg) differs significantly between automatic and manual cars. The test examines whether the vehicle transmission has an impact on fuel consumption.

The Dataset

The dataset mtcars ships with R (in the datasets package, which is loaded by default) and contains information on fuel consumption and 10 aspects of automobile design and performance for 32 automobiles.

The variables that will be included in the t-test are:

  • mpg: Miles/(US) gallon, a measure of fuel efficiency for cars (continuous datatype)
  • am: Transmission (0 = automatic, 1 = manual)

Inspecting the Dataset

The command head() displays the first entries of the data frame, which provides an initial overview of the data. The command str() displays the internal structure of the data.

# Inspect the dataset
head(mtcars)
str(mtcars)

Output

> head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
> str(mtcars)
'data.frame':	32 obs. of  11 variables:
 $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
 $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
 $ disp: num  160 160 108 258 360 ...
 $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
 $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
 $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
 $ qsec: num  16.5 17 18.6 19.4 17 ...
 $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
 $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
 $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
 $ carb: num  4 4 1 1 2 1 4 2 2 4 ...

We can see that the transmission types are represented by 0 (automatic) and 1 (manual) in the am column. Furthermore, the am and mpg variables are stored as numeric data.
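As a small complement to this inspection, the following sketch counts how many cars fall into each transmission group. The copy named cars and the helper column am_label are assumptions introduced here for illustration; they are not part of the original dataset or of the article's own code.

```r
# Work on a copy so the built-in mtcars object stays unchanged
cars <- mtcars

# am_label is a hypothetical helper column with readable group names
cars$am_label <- factor(cars$am, levels = c(0, 1),
                        labels = c("Automatic", "Manual"))

# Count the cars per transmission type
group_sizes <- table(cars$am_label)
print(group_sizes)
```

The two groups contain 19 automatic and 13 manual cars, which matches the degrees of freedom (18 and 12) reported by the F-test further down.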

Splitting the Values

# Split mpg values by transmission type

mpg_auto <- mtcars$mpg[mtcars$am == 0]
mpg_manual <- mtcars$mpg[mtcars$am == 1]

To perform the t-test, we split the mpg values by transmission type so that we have two groups to compare with each other. To do so, we create two new variables, mpg_auto and mpg_manual. In R, a variable is created by storing the data on the right side of the arrow under the name on the left side of the arrow. Let's have a closer look at the variable mpg_auto. The code on the right-hand side translates as: take the data from the mpg column of the mtcars dataset, but only where the value in the am column is 0. This means that only the values for automatic cars are stored in the new variable mpg_auto. We proceed in the same way for the variable mpg_manual, except that the value in the am column must now be 1.
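The same split can also be sketched with the base R function split(), which returns one vector per value of am. This is an alternative shown for illustration only, not the approach used in this article; the names mpg_by_am, mpg_auto_alt and mpg_manual_alt are made up for this example.

```r
# split() returns a named list with one numeric vector per value of am
mpg_by_am <- split(mtcars$mpg, mtcars$am)

mpg_auto_alt   <- mpg_by_am[["0"]]  # automatic cars
mpg_manual_alt <- mpg_by_am[["1"]]  # manual cars

# Check that both approaches select the same values
same_auto   <- identical(mpg_auto_alt, mtcars$mpg[mtcars$am == 0])
same_manual <- identical(mpg_manual_alt, mtcars$mpg[mtcars$am == 1])
print(c(same_auto, same_manual))
```

Logical indexing, as used above, keeps each step explicit, while split() scales more conveniently when a grouping variable has more than two values.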

Checking Assumptions

Before conducting the t-test we can examine the dataset and check for the different assumptions of a t-test to make sure that it produces valid results. The assumptions of a t-test are:

  • response variable should be continuous or ordinal
  • normal distribution within each group or a sufficiently large sample size (≥ 30)
  • data must be drawn randomly from a representative sample
  • Student's t-test requires equal variances in the two groups; Welch's t-test can deal with unequal variances

# Check normality within each group
shapiro.test(mpg_auto)
shapiro.test(mpg_manual)

# Check for equal variance between groups
var.test(mpg_auto, mpg_manual)
boxplot(mpg ~ am, data = mtcars)

We have already seen that the mpg values are continuous data, and it is assumed here that the data were drawn randomly from a representative sample. We can test the other assumptions using the Shapiro-Wilk test and the F-test. We use shapiro.test() to test whether each of the two groups is normally distributed. The function var.test() is used to check whether the variances of the two groups are equal.

Output

Fig. 1: A boxplot to compare mpg distributions
> shapiro.test(mpg_auto)

	Shapiro-Wilk normality test

data:  mpg_auto
W = 0.97677, p-value = 0.8987

> shapiro.test(mpg_manual)

	Shapiro-Wilk normality test

data:  mpg_manual
W = 0.9458, p-value = 0.5363

> var.test(mpg_auto,mpg_manual)

	F test to compare two variances

data:  mpg_auto and mpg_manual
F = 0.38656, num df = 18, denom df = 12, p-value = 0.06691
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.1243721 1.0703429
sample estimates:
ratio of variances 
         0.3865615

The Shapiro-Wilk test delivers a non-significant p-value for both groups, which means we cannot reject the null hypothesis that the data are normally distributed. We thus assume that mpg is normally distributed in each group. The p-value of the F-test is above 0.05, so the variances do not differ significantly.
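The decisions described above can also be read off programmatically, since both test functions return a list with a p.value element. The following is a minimal sketch; the significance level alpha = 0.05 and the names in p_values are assumptions chosen for this example.

```r
# Recreate the two groups so the example is self-contained
mpg_auto   <- mtcars$mpg[mtcars$am == 0]
mpg_manual <- mtcars$mpg[mtcars$am == 1]

alpha <- 0.05  # significance level assumed for this sketch

# Collect the p-values of the three assumption checks
p_values <- c(
  shapiro_auto   = shapiro.test(mpg_auto)$p.value,
  shapiro_manual = shapiro.test(mpg_manual)$p.value,
  f_test         = var.test(mpg_auto, mpg_manual)$p.value
)

print(round(p_values, 4))

# TRUE means the corresponding null hypothesis is not rejected at alpha
print(p_values > alpha)
```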

T-Test

# Perform an independent two-sample t-test

t_test_result <- t.test(mpg_manual, mpg_auto,
                        var.equal = TRUE)            # Student's t-test (equal variances assumed)

# Print the test result

print(t_test_result)

To perform the t-test, the function t.test() is used. Since mpg_manual and mpg_auto are compared with each other, they are passed as the first two arguments in the parentheses after t.test. This is a two-tailed test, so the order of the two variables does not affect the p-value; swapping them only reverses the sign of the t statistic and of the confidence interval. The argument alternative = "two.sided" specifies a two-tailed test, but since this is R's default it does not need to be written out here. Setting var.equal = TRUE tells R to treat the variances of the two groups as equal, so R uses Student's t-test, which assumes equal variances. Without this argument, R uses Welch's t-test by default, which could also be used here and remains appropriate when the variances differ. As before, the result on the right side of the arrow is stored in the variable on the left side and is then displayed using the print() command.
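For comparison, the same test can be sketched with the formula interface of t.test(), which works directly on the data frame without splitting the values first. The variable names student and welch are assumptions used only in this example.

```r
# Student's t-test via the formula interface (equal variances assumed)
student <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)

# Welch's t-test: what R runs when var.equal is left at its default
welch <- t.test(mpg ~ am, data = mtcars)

# Compare the p-values and degrees of freedom of both variants
print(c(student = student$p.value, welch = welch$p.value))
print(c(student = unname(student$parameter), welch = unname(welch$parameter)))
```

With the formula interface the groups are ordered by the value of am, so the automatic cars (0) come first and the t statistic has the opposite sign compared with t.test(mpg_manual, mpg_auto). The two-sided p-value of the Student variant is unaffected by this ordering.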

Output

Two Sample t-test

data:  mpg_manual and mpg_auto
t = 4.1061, df = 30, p-value = 0.000285
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  3.64151 10.84837
sample estimates:
mean of x mean of y 
 24.39231  17.14737

The hypothesis which is tested is:

H0: There is no difference in mean fuel efficiency (mpg) between manual and automatic cars (μ1 = μ2)
H1: There is a difference in mean fuel efficiency (mpg) between manual and automatic cars (μ1 ≠ μ2)

The p-value of 0.000285 is less than 0.05. Thus, the null hypothesis can be rejected, which means there is a significant difference between the two groups: manual cars have a higher mean mpg (24.39) than automatic cars (17.15). It can be concluded that the vehicle transmission has an impact on fuel consumption.
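To illustrate where the reported t value and p-value come from, the following sketch recomputes Student's t statistic by hand using the pooled variance and compares it with the result of t.test(). All variable names (x, y, pooled_var, t_by_hand, p_by_hand, reference) are assumptions introduced for this example.

```r
# Recreate the two groups so the example is self-contained
x <- mtcars$mpg[mtcars$am == 1]  # manual cars
y <- mtcars$mpg[mtcars$am == 0]  # automatic cars
n_x <- length(x)
n_y <- length(y)

# Pooled variance: weighted average of the two sample variances
pooled_var <- ((n_x - 1) * var(x) + (n_y - 1) * var(y)) / (n_x + n_y - 2)

# t statistic: difference in means divided by its standard error
t_by_hand <- (mean(x) - mean(y)) / sqrt(pooled_var * (1 / n_x + 1 / n_y))

# Degrees of freedom and two-sided p-value from the t distribution
deg_free  <- n_x + n_y - 2
p_by_hand <- 2 * pt(-abs(t_by_hand), df = deg_free)

# Compare with the built-in test
reference <- t.test(x, y, var.equal = TRUE)
print(all.equal(t_by_hand, unname(reference$statistic)))
print(all.equal(p_by_hand, reference$p.value))
```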

Visualise the difference

# Basic boxplot to compare mpg distributions

boxplot(mpg ~ am, data = mtcars,
        names = c("Automatic", "Manual"),
        col = c("skyblue", "lightgreen"),
        main = "MPG by Transmission Type",
        xlab = "Transmission Type",
        ylab = "Miles Per Gallon")

# Add mean points

points(x = c(1, 2),
       y = tapply(mtcars$mpg, mtcars$am, mean),
       pch = 19, col = "red")

Output

The mean values are marked as red dots in the box plot. As can be seen, they differ clearly from one another, which is consistent with the result of the t-test.

Fig. 2: A boxplot to compare mpg distributions with mean points

The author of this entry is Hauke Haese. Last edited: 06.02.2026