Two-way ANOVA CO2
Note: This article shows an R example on how to conduct a two-way analysis of variance (ANOVA) on the dataset CO2. For general information about ANOVA please refer to the following article: ANOVA.
In short: In this example a two-way ANOVA is used to analyse the uptake of CO2 of a plant based on its origin and its treatment as well as the interaction of those factors. This examines if chilling the plants overnight and the origin of the respective plant have an influence on its ability to take up CO2. The effect is then visualized in an interaction plot.
To run the script you can open RStudio and paste this code into a script or console.
If needed, you can install the ggplot2 library by running
install.packages("ggplot2").
The Dataset
The dataset CO2 is included in the datasets package that ships with base R. It records the CO2 uptake of six plants from Quebec and six plants from Mississippi at several levels of ambient CO2 concentration. Half the plants of each type were chilled overnight before the experiment was conducted.
The variables we will look at in our analysis are:
- uptake: a numeric vector containing carbon dioxide uptake rates of the examined plants
- Type: a factor containing information on the origin of the respective plant with the factor levels “Quebec” and “Mississippi”.
- Treatment: a factor containing information on the treatment overnight before the experiment was conducted with the factor levels “nonchilled” and “chilled”.
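As a quick check of these three variables, the following minimal sketch prints the factor levels and a numeric summary. The object name co2_df is only an illustrative choice; it is not part of the original script.

```r
# Load the built-in dataset (the datasets package ships with R)
library(datasets)

# Work on a plain data frame copy; co2_df is a hypothetical name
co2_df <- as.data.frame(CO2)

# Factor levels of the two predictors
levels(co2_df$Type)
levels(co2_df$Treatment)

# Numeric summary of the dependent variable
summary(co2_df$uptake)
```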
Examining the Dataset
Before conducting the analysis, we can examine the dataset and check for the different assumptions of a two-way ANOVA to make sure that the ANOVA produces valid results. The assumptions of the ANOVA are:
- Normality of the residuals (This can roughly be approximated by checking the normality of the dependent variable uptake)
- Equal variance for each combination of factor levels.
- Independent measurements
- Equal size in each treatment group (the combination of Type and Treatment)
#Load necessary libraries
library(ggplot2)
library(datasets)

#View the structure of the CO2 dataset
str(CO2)

#Boxplot to check for equal variance (Fig. 1)
boxplot(uptake ~ Type*Treatment, data = CO2)

#Table to check for balanced design
table(CO2$Treatment, CO2$Type)
Output
The analysis delivers the following results:
> str(CO2)
Classes ‘nfnGroupedData’, ‘nfGroupedData’, ‘groupedData’ and 'data.frame': 84 obs. of 5 variables:
$ Plant : Ord.factor w/ 12 levels "Qn1"<"Qn2"<"Qn3"<..: 1 1 1 1 1 1 1 2 2 2 ...
$ Type : Factor w/ 2 levels "Quebec","Mississippi": 1 1 1 1 1 1 1 1 1 1 ...
$ Treatment: Factor w/ 2 levels "nonchilled","chilled": 1 1 1 1 1 1 1 1 1 1 ...
> table(CO2$Treatment, CO2$Type)
             Quebec Mississippi
  nonchilled     21          21
  chilled        21          21
Independence of the measurements is a property of the study design and is treated as given in this example. Normality and equal variance across the factor combinations can be roughly judged from the boxplots (Fig. 1), which show uptake for each combination of Type and Treatment. The spread of the boxes is roughly similar, and no distribution appears strongly skewed. The table shows that the design is balanced, with 21 observations in each combination. The assumptions are thus met sufficiently.
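The visual checks can be complemented by looking at the model residuals directly. The following is a minimal sketch, not part of the original script: it assumes a Q-Q plot with a Shapiro-Wilk test for normality and Bartlett's test for equal variances as reasonable choices, and the object names (fit_check, res, cell, sw, bt) are illustrative.

```r
library(datasets)

# Fit the two-way model only to obtain its residuals
fit_check <- aov(uptake ~ Type * Treatment, data = CO2)
res <- residuals(fit_check)

# Q-Q plot of the residuals: points close to the line suggest normality
qqnorm(res)
qqline(res)

# Shapiro-Wilk test on the residuals (null hypothesis: normal distribution)
sw <- shapiro.test(res)
sw$p.value

# Bartlett's test for equal variances across the four Type x Treatment cells
cell <- interaction(CO2$Type, CO2$Treatment)
bt <- bartlett.test(CO2$uptake, cell)
bt$p.value
```

Small p-values in either test would point to a violated assumption; both tests are sensitive to sample size, so they are best read together with the plots.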
Two-way ANOVA
An ANOVA is performed where
uptake ~ Type * Treatment
tests the effects of Type, Treatment, and their interaction on uptake. An interaction means that the effect of one variable depends on the level of another variable. In this case we test whether the effect of Treatment on uptake depends on Type.
summary(anova_result)
will show the ANOVA table with F-values, p-values, etc., to determine the significance of each factor.
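In R's formula syntax, the * operator is shorthand for both main effects plus their interaction. The short sketch below makes this visible by listing the terms of both spellings; the names f_short and f_long are illustrative only.

```r
# Shorthand: main effects and interaction in one operator
f_short <- uptake ~ Type * Treatment

# Explicit spelling of the same model
f_long <- uptake ~ Type + Treatment + Type:Treatment

# Both formulas expand to the same three model terms
attr(terms(f_short), "term.labels")
attr(terms(f_long), "term.labels")
```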
#Perform ANOVA to analyse CO2 uptake based on Type and Treatment from the dataset CO2
anova_result <- aov(uptake ~ Type * Treatment, data = CO2)

#Display the ANOVA table
summary(anova_result)

#Optional: To visualize the interaction between factors Type and Treatment (Fig. 2)
ggplot(CO2, aes(x = Treatment, y = uptake, color = Type)) +
  geom_point() +
  geom_line(aes(group = Type)) +
  labs(title = "CO2 Uptake by Treatment and Type", x = "Treatment", y = "CO2 Uptake") +
  theme_minimal()
Output
The analysis delivers the following results:
               Df Sum Sq Mean Sq F value   Pr(>F)
Type            1   3366    3366  52.509 2.38e-10 ***
Treatment       1    988     988  15.416 0.000182 ***
Type:Treatment  1    226     226   3.522 0.064213 .
Residuals      80   5128      64
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
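- Df: The degrees of freedom are the number of independent values that can vary in our design. For each factor they are calculated as the number of factor levels minus one (2 - 1 = 1); for the interaction they are the product of the factors' degrees of freedom (1 × 1 = 1). The residual degrees of freedom are the number of observations minus the number of factor-level combinations (84 - 4 = 80).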
- Sum Sq: The sum of squares measures how much of the variation in uptake is attributed to each term. The sums of squares of all terms plus the residual sum of squares equal the total variation (3366 + 988 + 226 + 5128 = 9708). Dividing the Sum Sq of one term by the total variation gives the share of variation explained by that term. The factor Type explains 3366 out of 9708 (about 35 %) of the variation, while the interaction explains only 226 out of 9708 (about 2 %).
- Pr(>F): The p-value is the probability of obtaining an F value at least as large as the observed one if the null hypothesis (no effect on uptake) were true. For the main effects of Type and Treatment, the p-value is low enough (< 0.05) that we can reject the null hypothesis and assume a significant effect of each independent variable on the uptake of CO2. For the interaction of both factors the p-value is above 0.05 (0.064), so the null hypothesis cannot be rejected and we do not assume a significant interaction effect between Type and Treatment. However, this p-value is only marginally above 0.05, so the other values, such as the share of variation explained, should also be taken into account when evaluating the results.
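The quantities in the ANOVA table can also be recomputed by hand, which may help in reading it. The sketch below is an illustration rather than part of the original script: it extracts the table from the fitted model, derives each term's share of the total sum of squares, and recomputes the interaction p-value from the F distribution with pf(). The names fit, tab, ss, share, f_int and p_int are illustrative.

```r
library(datasets)

# Refit the model so that this block runs on its own
fit <- aov(uptake ~ Type * Treatment, data = CO2)

# summary() of an aov fit returns a list; its first element is the ANOVA table
tab <- summary(fit)[[1]]

# Sum of squares per row; row names are padded with spaces, so trim them
ss <- tab[["Sum Sq"]]
names(ss) <- trimws(rownames(tab))

# Share of the total variation attributed to each term and to the residuals
share <- ss / sum(ss)
round(share, 3)

# Interaction p-value recomputed as the upper tail of the F distribution
f_int <- tab[["F value"]][3]
p_int <- pf(f_int, df1 = tab[["Df"]][3], df2 = tab[["Df"]][4], lower.tail = FALSE)
p_int
```

The recomputed p-value should match the Pr(>F) entry of the Type:Treatment row, since that column is derived from the F value and the two degrees of freedom.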
The interaction effect can be visualized in an interaction plot (Fig. 2). This type of plot displays the values of the dependent variable uptake on the y-axis. The x-axis shows the levels of the predictor Treatment. The lines represent the levels of the second predictor Type. Parallel lines indicate no interaction effect, whereas different slopes indicate an interaction effect. The lines are fairly parallel. This visualizes the result of the ANOVA, which indicates no significant interaction effect.
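An alternative way to draw such a plot is to connect the group means instead of the raw observations. The sketch below is an assumption about how this could be done with base R's interaction.plot() and is not the code behind Fig. 2. The names cell_means and chill_effect are illustrative.

```r
library(datasets)

# Mean uptake for each Treatment x Type combination
cell_means <- with(CO2, tapply(uptake, list(Treatment, Type), mean))
cell_means

# Interaction plot of the cell means: one line per plant origin
with(CO2, interaction.plot(x.factor = Treatment, trace.factor = Type,
                           response = uptake, fun = mean,
                           xlab = "Treatment", ylab = "Mean CO2 uptake",
                           trace.label = "Type"))

# Change in mean uptake from nonchilled to chilled, per origin;
# similar values correspond to roughly parallel lines
chill_effect <- cell_means["chilled", ] - cell_means["nonchilled", ]
chill_effect
```

Comparing the two values in chill_effect gives a rough numeric counterpart to judging the slopes by eye.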
Further Resources
CO2 dataset: RDocumentation
ANOVA: Cookbook for R
Q-Q plot: Cookbook for R
The author of this entry is Jana Simon