Two-way ANOVA is a statistical technique used to analyze the effects of two independent categorical variables on a continuous dependent variable.
Definition: Two-way ANOVA
ANOVA stands for Analysis of Variance, a statistical method that compares the means of two or more groups to determine whether they differ significantly from each other. A two-way ANOVA applies this comparison to groups defined by two categorical variables at once.
When is a two-way ANOVA used?
A two-way ANOVA is appropriate when you have gathered data on a continuous dependent variable measured at different levels of two categorical independent variables.
The dependent variable in a two-way ANOVA can be a numerical measure of a characteristic or behavior and can be averaged across groups to calculate the mean value.^{1}
Salary is a quantitative variable because it measures income as a number. Salaries can be summed and divided by the number of people to find the average salary per person.
A categorical variable represents a set of categories or groups. It is a variable that can take on one of a limited number of values or levels, which are often represented by labels or names.^{2}
For example, male and female are levels within the categorical variable gender type, and age groups 1, 2, and 3 are levels within the categorical variable age group.
The function of the two-way ANOVA
The two-way ANOVA uses the F test to determine whether the differences between groups are statistically significant. The F test is a groupwise comparison test: it compares the variance between the group means with the residual variance within the groups.
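As a compact restatement of that comparison, the F statistic for each effect can be written as a ratio of mean squares. The subscripts below are notation assumed for illustration; they do not appear in the original text.

```latex
F_{\text{effect}} = \frac{MS_{\text{effect}}}{MS_{\text{residual}}},
\qquad
MS = \frac{SS}{df}
```

A large ratio means the group means vary more than the within-group scatter alone would suggest.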
In a two-way ANOVA with interaction, three hypotheses can be tested:
 There is no significant difference between the means of the groups formed by varying factor 1.
 There is no significant difference between the means of the groups formed by varying factor 2.
 There is no significant difference in the means of the groups formed by varying the levels of factor 1 and factor 2 together.^{3}
In contrast, a two-way ANOVA with no interaction tests only whether each factor has a main effect on the dependent variable; it assumes that the factors do not interact.
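One common way to write the model behind these hypotheses is sketched below. The symbols are assumed notation rather than the article's own: \alpha_i is the effect of level i of factor 1, \beta_j the effect of level j of factor 2, and (\alpha\beta)_{ij} their interaction.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}

H_0^{(1)}: \alpha_i = 0 \ \text{for all } i, \qquad
H_0^{(2)}: \beta_j = 0 \ \text{for all } j, \qquad
H_0^{(12)}: (\alpha\beta)_{ij} = 0 \ \text{for all } i, j
```

The model without interaction drops the (\alpha\beta)_{ij} term, leaving only the first two hypotheses to test.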
In our average salary experiment, we can use a two-way ANOVA to test three hypotheses:
Null hypothesis (H_{0}) | Alternate hypothesis (H_{a})
There is no difference in average salary for any gender type | There is a difference in average salary by gender type
There is no difference in average salary at any age bracket | There is a difference in average salary between age brackets
The effect of one independent variable on average salary does not depend on the level of the other independent variable (i.e., no interaction effect) | There is an interaction effect between age group and gender type on average salary
Two-way ANOVA assumptions
A two-way ANOVA makes several assumptions about the data and the statistical model that must be met for the results to be reliable and valid.

Homogeneity of variance
The variance of the dependent variable should be equal across all groups. If your data set fails to exhibit homogeneity of variance, consider a non-parametric alternative such as the Kruskal-Wallis test, keeping in mind that it compares groups along one factor at a time.
Independence of observations
In a two-way ANOVA, the observations should be independent of each other. This means that the values of the dependent variable in one group should not be related to the values in any other group.
Normally distributed dependent variable
The data within each group should follow a normal distribution. This can be checked using normal probability plots or formal tests of normality.
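The checks above can be sketched in base R. This is a minimal illustration on simulated data: the data frame, its values, and the column names are hypothetical stand-ins for the article's worker.data, and Bartlett's and Shapiro-Wilk tests are one possible choice of tests, not necessarily the ones the author used.

```r
set.seed(42)

# Simulated stand-in for worker.data (hypothetical values, 10 workers per cell).
worker.data <- data.frame(
  gender    = factor(rep(c("male", "female"), each = 30)),
  age_group = factor(rep(c("1", "2", "3"), times = 20)),
  salary    = rnorm(60, mean = 50, sd = 5)
)

fit <- aov(salary ~ gender * age_group, data = worker.data)

# Homogeneity of variance across the six gender-by-age cells (Bartlett's test).
var.check <- bartlett.test(salary ~ interaction(gender, age_group),
                           data = worker.data)

# Normality of the model residuals (Shapiro-Wilk test).
norm.check <- shapiro.test(residuals(fit))

var.check$p.value   # a small p-value suggests unequal variances
norm.check$p.value  # a small p-value suggests non-normal residuals
```

Independence cannot be tested this way; it follows from how the data were collected.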
Conducting a two-way ANOVA
The dataset from our income experiment includes observations of:
 Income (average salary per person)
 Gender type (male, female)
 Age group (1 = 18-30, 2 = 31-50, or 3 = 51 and above)
 Industry (1, 2, 3, 4)
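A data set with this structure might be prepared in R roughly as follows. The values are simulated and the column names (salary, gender, age_group, block) are assumptions made for illustration; the point is that the numeric codes must be converted to factors so that R treats them as categories rather than numbers.

```r
set.seed(1)
n <- 24

# Hypothetical data with the same four variables as the income experiment.
worker.data <- data.frame(
  salary    = round(rnorm(n, mean = 50000, sd = 8000)),
  gender    = sample(c("male", "female"), n, replace = TRUE),
  age_group = sample(1:3, n, replace = TRUE),
  block     = sample(1:4, n, replace = TRUE)  # industry code
)

# Convert the grouping variables to factors with explicit levels.
worker.data$gender    <- factor(worker.data$gender)
worker.data$age_group <- factor(worker.data$age_group, levels = 1:3,
                                labels = c("18-30", "31-50", "51+"))
worker.data$block     <- factor(worker.data$block, levels = 1:4)

str(worker.data)  # confirm that the grouping columns are factors
```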
Two-way ANOVA in R
The two-way ANOVA will test whether the independent variables (gender type and age group) affect the dependent variable (average salary). But there are some other possible sources of variation in the data that we want to take into account.
After loading the data into the R environment, we will create each of the three models using the aov() command, and then compare them using the aictab() command.
Two-way ANOVA R code
# age_group is the assumed column name for age group (R names cannot contain spaces)
two.way <- aov(salary ~ gender + age_group, data = worker.data)
In the second model, to test whether the interaction of gender and age group influences the salary, use a '*' between the two variables to specify that you also want to know the interaction effect.
Two-way ANOVA with interaction R code
interaction <- aov(salary ~ gender * age_group, data = worker.data)
Because our workers were randomized within industries, we add this variable as a blocking factor in the third model. We can then compare our two-way ANOVAs with and without the blocking variable to see whether the industry matters.
Two-way ANOVA with blocking R code
# block is assumed to be the column that holds the industry blocking factor
blocking <- aov(salary ~ gender * age_group + block, data = worker.data)
Model comparison
We can use the Akaike information criterion (AIC) to find the best-fit model, that is, the model that explains the most variation using the fewest parameters. The aictab() function from the AICcmodavg package performs the model comparison.
AIC R Sample code
library(AICcmodavg)
model.set <- list(two.way, interaction, blocking)
model.names <- c("two.way", "interaction", "blocking")
aictab(model.set, modnames = model.names)
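If the AICcmodavg package is not available, a similar comparison can be sketched with the base R function AIC(). The data below are simulated and the column names are assumed for illustration. AIC() reports the plain AIC, whereas aictab() reports the small-sample corrected AICc by default, so the numbers will not match exactly.

```r
set.seed(7)

# Hypothetical balanced data: one worker per gender x age group x industry cell.
worker.data <- expand.grid(
  gender    = c("male", "female"),
  age_group = factor(1:3),
  block     = factor(1:4)
)
worker.data$salary <- 50 + 3 * (worker.data$gender == "male") +
  rnorm(nrow(worker.data), sd = 2)

two.way     <- aov(salary ~ gender + age_group, data = worker.data)
interaction <- aov(salary ~ gender * age_group, data = worker.data)
blocking    <- aov(salary ~ gender * age_group + block, data = worker.data)

# One row per model: number of estimated parameters (df) and AIC.
comparison <- AIC(two.way, interaction, blocking)
comparison[order(comparison$AIC), ]  # lowest AIC first
```

The model with the lowest AIC is preferred; differences of only a few units are usually treated as weak evidence.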
Two-way ANOVA – Result interpretation
The summary of the two.way model, printed with summary(two.way), looks like this:
 | Df | Sum Sq | Mean Sq | F value | Pr(>F)
Gender | 2 | 6.068 | 3.034 | 9.073 | 0.000253 ***
Age | 1 | 5.122 | 5.122 | 15.316 | 0.000174 ***
Residuals | 92 | 30.765 | 0.334 | |
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
The model can be interpreted using the following columns:

Df
displays the degrees of freedom for each independent variable, which is equal to the number of levels in the variable minus 1.
Sum Sq
represents the sum of squares, which is the variation between the group means created by the independent variable levels and the overall mean.
Mean Sq
indicates the mean square, which is the sum of squares divided by the degrees of freedom.
F value
is the test statistic obtained from the F test, which is calculated by dividing the mean square of the variable by the mean square of the residuals.
Pr(>F)
indicates the p-value of the F statistic, which is the probability of obtaining an F value at least as large as the calculated one if the null hypothesis of no difference were true.
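To see how the columns relate, the short sketch below recomputes the mean squares, F values, and p-values from the Df and Sum Sq entries in the table above. The object names are chosen for illustration, and small differences from the printed values come from rounding in the table.

```r
# Degrees of freedom and sums of squares copied from the ANOVA table.
dof    <- c(Gender = 2, Age = 1, Residuals = 92)
sum.sq <- c(Gender = 6.068, Age = 5.122, Residuals = 30.765)

# Mean square = sum of squares / degrees of freedom.
mean.sq <- sum.sq / dof

# F value = mean square of the variable / mean square of the residuals.
f.value <- mean.sq[c("Gender", "Age")] / mean.sq[["Residuals"]]

# p-value = upper tail of the F distribution at the observed F value.
p.value <- pf(f.value, df1 = dof[c("Gender", "Age")],
              df2 = dof[["Residuals"]], lower.tail = FALSE)

round(mean.sq, 3)
round(f.value, 2)
signif(p.value, 3)
```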
Post-hoc test
A post-hoc test is used to find out which levels actually differ from each other, since ANOVA only shows which factors are significant. We use Tukey's Honestly Significant Difference (Tukey HSD) test as shown below:
Tukey R code
TukeyHSD(two.way)
The output looks like this:
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = salary ~ gender + age_group, data = worker.data)
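The pairwise comparisons that follow this header can be read from the object that TukeyHSD() returns. The sketch below uses simulated data with assumed column names, so its numbers are illustrative only and are not the results of the income experiment.

```r
set.seed(3)

# Hypothetical balanced data: 10 workers per gender x age group cell.
worker.data <- expand.grid(
  gender    = c("male", "female"),
  age_group = factor(1:3),
  rep       = 1:10
)
worker.data$salary <- 50 + 4 * (worker.data$age_group == "3") +
  rnorm(nrow(worker.data), sd = 3)

two.way <- aov(salary ~ gender + age_group, data = worker.data)

# Pairwise comparisons between the three age groups only.
tukey <- TukeyHSD(two.way, which = "age_group", conf.level = 0.95)

# One row per pair of levels: difference in means (diff), confidence
# bounds (lwr, upr), and the adjusted p-value (p adj).
tukey$age_group
```

A pair of levels differs significantly at the 5% level when its adjusted p-value is below 0.05, which is equivalent to its confidence interval excluding zero.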
Two-way ANOVA – Result presentation
FAQs
A one-way ANOVA is used to test for differences between two or more groups on a single independent variable, whereas a two-way ANOVA is used to test for differences between groups on two independent variables, and their interaction effect on a single dependent variable.
ANOVA is typically used when you want to determine if there is a significant difference between the means of two or more groups.
The assumptions of ANOVA include normality, homogeneity of variances, and independence of observations.
The results of a two-way ANOVA test are typically reported as F-statistics, with a corresponding p-value that indicates the statistical significance of the differences between the means of the groups being compared.
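As a rough sketch of such a report, the F statistics and p-values can be pulled out of the ANOVA table programmatically. The data are simulated, and the column names and the wording of the report line are assumptions for illustration, not a prescribed format.

```r
set.seed(11)

# Hypothetical balanced data: 8 workers per gender x age group cell.
worker.data <- expand.grid(
  gender    = c("male", "female"),
  age_group = factor(1:3),
  rep       = 1:8
)
worker.data$salary <- 50 + 2 * (worker.data$gender == "male") +
  rnorm(nrow(worker.data), sd = 3)

fit <- aov(salary ~ gender * age_group, data = worker.data)

# summary() returns a list; its first element is the ANOVA table.
tab <- summary(fit)[[1]]
rownames(tab) <- trimws(rownames(tab))  # row names are padded with spaces

effects  <- setdiff(rownames(tab), "Residuals")
resid.df <- as.integer(tab["Residuals", "Df"])

# One line per effect: F(effect df, residual df) = F value, p = p-value.
report <- sprintf("%s: F(%d, %d) = %.2f, p = %.3f",
                  effects, as.integer(tab[effects, "Df"]), resid.df,
                  tab[effects, "F value"], tab[effects, "Pr(>F)"])
cat(report, sep = "\n")
```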
Sources
^{1} Zach. “Two-Way ANOVA in Excel: Definition, Formula and Example.” Statology. December 30, 2018. https://www.statology.org/twowayanova/.
^{2} UCLA Office of Academic Computing. “What Statistical Analysis Should I Use? Statistical Analyses Using SPSS.” UCLA. Accessed on February 22, 2023. https://stats.oarc.ucla.edu/spss/whatstat/whatstatisticalanalysisshouldiusestatisticalanalysesusingspss/.
^{3} Jones, James F. “Chapter 13 – Two-Way Analysis of Variance.” James F. Jones Online Math Textbook. Accessed on February 22, 2023. https://people.richland.edu/james/lecture/m170/ch132wy.html.