# Academic Writing

## Rhetorical functions in academic writing: Presenting findings from statistical analyses

### Introduction

A common feature of any kind of primary research is that empirical data is used to support arguments and claims. In quantitative research this evidence is almost entirely number-based statistics. In a study, such as a questionnaire study, you will need to analyse the results statistically and include your results in tables and graphs to illustrate and support your findings.

When writing about your findings in the results section of your report, it is important to remember that the purpose is to present the results of your data analysis. It is normally not appropriate at this stage to discuss these results. That takes place in the discussion and conclusions.

The primary purpose of the results section is to present the data in a standard way. It is important to structure the results section, addressing each hypothesis in order. The normal format is for the results of the research to be reported factually and formally without detailed analysis. Results are presented both verbally and with figures and tables to help understanding.

There are standard conventions to follow when reporting statistics. It is usual to start by providing a description of the statistical test(s) used. This is followed by descriptive statistics, such as central tendency and standard deviation/variance. These may then be followed by the inferential statistics, such as *t*-tests or correlations. When reporting the findings from inferential tests, it is important to include the obtained value of the test statistic, the degrees of freedom, and the level of probability, with the implications for the null and alternative hypotheses. In addition, it is common for effect sizes to be reported.

This information is reported in a concise statement, such as:

An independent-samples t-test was conducted to evaluate the hypothesis that using mental images produces a significant difference in memory performance. The group using mental images recalled more words (*M* = 25, *SD* = 4.71) […]. This difference was significant, *t*(18) = -3.00, *p* < .05, two-tailed (Gravetter & Wallnau, 1996, p. 299).

In the first sentence the hypothesis to be tested and the test used are introduced. In the second sentence, the mean (*M* = 25) and the standard deviation (*SD* = 4.71) are presented. The next sentence provides the results of the statistical analysis. Note that the degrees of freedom are reported in parentheses immediately after the symbol *t*. The value for the obtained *t* statistic follows (-3.00), and next is the probability of committing a Type I error (less than 5%). Finally, the type of test (one- versus two-tailed) is noted.

We have carried out an independent-samples t-test to compare the happiness scores for American men and women. There was a significant difference in scores for men […] (Dörnyei, 2007, p. 217).

In the first sentence the hypothesis to be tested and the test used are introduced. In the second sentence, the mean (*M* = 1.76) and the standard deviation (*SD* = .69) are presented. The next sentence provides the results of the statistical analysis. Note that the degrees of freedom are reported in parentheses immediately after the symbol *t*. The value for the obtained *t* statistic follows (-2.02), and next is the probability of committing a Type I error (less than 5%). Finally, the effect size (.003) is noted.
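The reporting pattern described above (degrees of freedom in parentheses after *t*, then the obtained value, then the probability level and the type of test) can be sketched as a small formatting helper. This is only an illustration: the function name `format_t_result` and its parameters are hypothetical, and the values passed in are those quoted from Gravetter and Wallnau.

```python
def format_t_result(df: int, t: float, alpha: float, tails: str) -> str:
    """Build a report string such as 't(18) = -3.00, p < .05, two-tailed'."""
    # Write the probability level without a leading zero, as the quoted examples do.
    alpha_text = f"{alpha:.2f}".lstrip("0")
    # Degrees of freedom go in parentheses immediately after the symbol t.
    return f"t({df}) = {t:.2f}, p < {alpha_text}, {tails}"


print(format_t_result(18, -3.00, 0.05, "two-tailed"))
```

In running text the symbols *t* and *p* would be italicised; the helper only assembles the plain string.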

### APA

According to the American Psychological Association (2020, p. 181):

When reporting inferential statistics (e.g., *t* tests, *F* tests, and chi-square tests and associated effect sizes and confidence intervals), include information to allow readers to fully understand the analyses conducted. The data supplied, preferably in the text but possibly in supplemental materials depending on the magnitude of the data arrays, should allow readers to confirm the basic reported analyses (e.g., cell means, standard deviations, sample sizes, correlations) and should enable interested readers to construct some effect-size estimates and confidence intervals beyond those supplied in the paper per se. In the case of multilevel data, present summary statistics for each level of aggregation. What constitutes sufficient information depends on the analytic approach reported.

### Null Hypothesis

To begin the quantitative research process, the researcher often states two opposing hypotheses:

- The first is the null hypothesis, or *H*₀. This hypothesis states that the treatment has no effect, that there is no change, no difference, that nothing happened. The null hypothesis (*H*₀) predicts that the independent variable (treatment) has no effect on the dependent variable for the population.
- The second hypothesis is usually called the alternative hypothesis (*H*₁). This hypothesis states that the treatment does have an effect on the dependent variable. The alternative hypothesis (*H*₁) predicts that the independent variable (treatment) does have an effect on the dependent variable for the population.

After data collection, the researcher compares the data with the null hypothesis and makes a decision according to criteria established earlier. There are two possible decisions, and both are stated in terms of the null hypothesis.

- One possibility is that the researcher decides to reject the null hypothesis. In this case, the data provides strong evidence that the treatment does have an effect.
- The second possibility is to fail to reject the null hypothesis. In this case, the data does not provide evidence that the treatment has an effect.

**Rejecting or disproving the null hypothesis is a central task in modern scientific practice.**

However, when you are writing about your findings, you will not usually write about the null hypothesis. In research reports, the researcher does not actually state that “the null hypothesis was rejected.” Instead, you report that the effect of the treatment was statistically significant. Likewise, when *H*₀ is not rejected, you simply state that the treatment effect was not statistically significant or that there was no evidence for a treatment effect. In fact, when you read scientific reports, you will note that the terms null hypothesis and alternative hypothesis are rarely mentioned.

Findings are said to be statistically significant when the null hypothesis has been rejected. Thus, if results achieve statistical significance, the researcher concludes that a treatment effect occurred.

(Gravetter & Wallnau, 1996, chs. 8 & 9).
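The mapping from the decision about the null hypothesis to the wording used in a report can be sketched as follows. The function name `report_decision` and the default alpha of .05 are assumptions made for illustration; the two report sentences paraphrase the wording recommended above.

```python
def report_decision(p: float, alpha: float = 0.05) -> tuple[str, str]:
    """Return (decision about H0, wording to use in the report)."""
    if p < alpha:
        # H0 is rejected, but the report talks about significance instead.
        return ("reject H0",
                "The effect of the treatment was statistically significant.")
    # H0 is not rejected; the report says there was no significant effect.
    return ("fail to reject H0",
            "The treatment effect was not statistically significant.")


decision, sentence = report_decision(0.03)
print(decision)
print(sentence)
```

Only the second element would normally appear in the results section; the first is the decision that stays implicit.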

### Example

Figure 1 shows the mean attractiveness ratings given by participants in each of the four experimental conditions: When participants were drunk, the attractiveness ratings were higher than when participants were sober, supporting the idea that the beer-goggles effect is alcohol dependent. The level of lighting appeared to have an effect in sober participants who rated the stooges as more attractive in dim lighting, […]

A two-way 2 (alcohol: 0 pints or 6 pints) × 2 (lighting: dim vs. bright) repeated-measures ANOVA was conducted on the attractiveness ratings. This revealed a significant main effect of alcohol, […] (Field, 2016).

### Examples from Textbooks

#### Presenting descriptive statistics

Mean sales for the organisation’s 30 employees were £46,600. As the mean, median and mode are virtually the same, this suggests these data are normally distributed. Consequently the standard deviation of 18.46 indicates that 95 per cent of sales fell within the range £10,318 to £82,682, the complete range being £68,000. (Saunders & Lewis, 2012, p. 178)

#### Presenting the strength of relationship between pairs of variables

There is a statistically significant strong positive relationship between the number of enquiries and the number of sales […] (Saunders, Lewis & Thornhill, 2012, p. 522).

#### Presenting the results from correlation

The relationship between perceived control of internal states (as measured by the PCOISS) and perceived stress (as measured by the Perceived Stress Scale) was investigated using Pearson product-moment correlation coefficient. Preliminary analyses were performed to ensure no violation of the assumptions of normality, linearity and homoscedasticity. There was a strong, negative correlation between the two variables, […] (Pallant, 2010, p. 135).

A set of Pearson correlations were computed to determine if there were any significant relationships between a number of employee variables. The correlation between starting and current salary is +.735; this is significant at the .01 level. The null hypothesis can be rejected. Starting salary appears to provide a moderate guide to current salary as it predicts around 54% of current salary level. The remainder of the unexplained variance may involve inter alia qualifications/skills developed over the time period and differential opportunities for promotion. (Burns & Burns, 2008, p. 354)
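The "around 54%" in the Burns and Burns example is the squared correlation coefficient (the coefficient of determination), which gives the proportion of variance in one variable accounted for by the other. A minimal check of that arithmetic, with a hypothetical helper name:

```python
def variance_explained(r: float) -> float:
    """Proportion of shared variance implied by a Pearson correlation r."""
    return r ** 2


r = 0.735  # correlation between starting and current salary, as quoted
percent = round(variance_explained(r) * 100)
print(f"r = {r}, variance explained = {percent}%")
```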

#### Presenting the results from regression

A linear regression analysis was conducted to evaluate the prediction of monthly sales value from floor area of a set of 14 branches of a large multiple store. The scattergraph indicates that they are positively and strongly linearly related such that as floor area increases so does monthly sales income, in fact by $1,686 per sq mt. A histogram and residual plots indicate that linear regression assumptions are met. (Burns & Burns, 2008, p. 384)

#### Presenting the results from multiple regression

Hierarchical multiple regression was used to assess the ability of two control measures (Mastery Scale, Perceived Control of Internal States Scale: PCOISS) to predict levels of stress (Perceived Stress Scale), after controlling for the influence of social desirability and age. Preliminary analyses were conducted to ensure no violation of the assumptions of normality, linearity, multicollinearity and homoscedasticity. Age and social desirability were entered at Step 1, explaining 6% of the variance in perceived stress. After entry of the Mastery Scale and PCOISS Scale at Step 2 the total variance explained by the model as a whole was 47.4%, […] (Pallant, 2010, p. 167).

#### Reporting the output from chi-square for goodness of fit

A chi-square goodness-of-fit test indicates there was no significant difference in the proportion of smokers identified in the current sample (19.5%) as compared with the value of 20% that was obtained in a previous nationwide study, […] (Pallant, 2010, p. 216).

#### Reporting the output from chi-square test for independence

A chi-square test for independence indicated no significant association between gender and smoking status, […] (Pallant, 2010, p. 222).

#### Presenting the results for independent samples t-test

An independent-samples t-test was run to determine if there were differences in engagement to an advertisement between males and females. There were no outliers in the data, as assessed by inspection of a boxplot. Engagement scores for each level of gender were normally distributed, as assessed by Shapiro-Wilks test […]

An independent-samples […] (Pallant, 2010, p. 243).

An independent-samples t-test was conducted to evaluate the hypothesis that smokers and non-smokers differ significantly in their self-concept levels. The mean self-concept score of non-smokers […] (Burns & Burns, 2008, pp. 268-269).

We have carried out an independent-samples t-test to compare the happiness scores for American men and women. There was a significant difference in scores for men […] (Dörnyei, 2007, p. 217).

The one-degree-of-freedom contrast of primary interest was significant at the specified […] (American Psychological Association, 2020, p. 182).

#### Reporting the output for Mann-Whitney test

The Mann-Whitney U-test showed that there was no significant difference in absence rates in 2006 between male and female employees […] (Burns & Burns, 2008, p. 272).

#### Presenting the results for paired samples t-test

A paired samples t test […] (Burns & Burns, 2008, p. 276).

#### Presenting the results from one-way between-groups ANOVA

A one-way analysis of variance indicated that there was a significant difference in happiness amongst white people […] (Dörnyei, 2007, p. 221).

#### Presenting the results from one-way between-groups ANOVA with post-hoc tests

A one-way between-groups analysis of variance was conducted to explore the impact of age on levels of optimism, as measured by the Life Orientation Test (LOT). Participants were divided into three groups according to their age (Group 1: 29yrs or less; Group 2: 30 to 44yrs; Group 3: 45yrs and above). There was a statistically significant difference at the […] (Pallant, 2010, p. 255).

#### Presenting effect size

We have carried out an independent-samples t-test to compare the happiness scores for American men and women. There was a significant difference in scores for men […] (Dörnyei, 2007, p. 217).

An independent-samples t-test was conducted to evaluate the hypothesis that smokers and non-smokers differ significantly in their self-concept levels. The mean self-concept score of non-smokers […] (Burns & Burns, 2008, pp. 268-269).

A paired samples […] (Burns & Burns, 2008, p. 276).

A one-way analysis of variance indicated that there was a significant difference in happiness amongst white people […] (Dörnyei, 2007, p. 221).

A one-way between-groups analysis of variance was conducted to explore the impact of age on levels of optimism, as measured by the Life Orientation Test (LOT). Participants were divided into three groups according to their age (Group 1: 29yrs or less; Group 2: 30 to 44yrs; Group 3: 45yrs and above). There was a statistically significant difference at the […] (Pallant, 2010, p. 255).

### Examples from Published Research

On average, participants reported that they took the evaluation process somewhat seriously […]. Bassett, J., Cleveland, A., Acorn, D., Nix, M. & Snyder, T. (2015). Are they paying attention? Students’ lack of motivation and attention potentially threaten the utility of course evaluations.

The analysis revealed reliably higher percentages of overlap when participants were required to cite three sources […] *F*(1,85) = .96, ns. Youmans, R. J. (2011). Does the adoption of plagiarism-detection software in higher education reduce plagiarism?

There was a significant effect of condition upon self-reported disgust ([…] interpersonal, *p* < 0.001) and that there was no significant difference between the interpersonal and outgroup conditions. Reicher, S. D., Templeton, A., Neville, F., Ferrari, L. & Drury, J. (2015). Core disgust is attenuated by ingroup relations.

Results indicate that laptop use by fellow students was the single most reported distracter (*n* = 229), accounting for 64% of all responses. This was significantly greater than all other responses combined […]. Fried, C. B. (2008). In-class laptop use and its effects on student learning.

With respect to the effectiveness of annotation type, Table 1 shows that the Combination group consistently outperformed the other two groups across all measures. The Combination group did significantly better than the Text-only group for the Picture Recognition test: the effect size was .59, a medium effect according to Cohen (1988). The Combination group also outperformed the Picture-only group significantly for the Definition Supply test (strict): the effect size was .61 (a medium effect). Yoshii, M. & Flaitz, J. (2002). Second language incidental vocabulary retention: The effect of text and picture annotation types. |

In sex differences in human mate preferences conducted in 37 cultures, Buss (1989) reported that males value physical attractiveness in potential mates more than females do in 34 cultures (mean Cohen’s *d* = 0.59). Guéguen, N. (2015). High heels increase women’s attractiveness.

However, L2 writing score is not significantly correlated with LLAMA B, r = .05, LLAMA F, r = .09, and effect sizes for the correlation between L2 writing score and LLAMA total score, r =.23, LLAMA D, r = .13, as well as that between L2 writing score and productive vocabulary size, r = .23, are rather small. Yang, Y, Sun, Y, Chang, P. & Li, Y. (2019). Exploring the relationship between language aptitude, vocabulary size, and EFL graduate students’ L2 writing performance. |

See also: Writing Functions 9: Including Charts & Diagrams

### Language

#### Descriptive

The average age of participants was … (*SD* = …).

“The average age of participants was 25.5 years (*SD* = 7.94).”

The age of participants ranged from … to … years (*M* = …, *SD* = …).

“The age of participants ranged from 18 to 70 years (*M* = 25.5, *SD* = 7.94).”

Age was non-normally distributed, with skewness of … (*SE* = …) and kurtosis of … (*SE* = …)

“Age was non-normally distributed, with skewness of 1.87 (*SE* = 0.05) and kurtosis of 3.93 (*SE* = 0.10).”

Participants were … and …, aged … to … years.

“Participants were 98 men and 132 women, aged 17 to 25 years (men: *M* = 19.2, *SD* = 2.32; women: *M* = 19.6, *SD* = 2.54).”
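The *M* and *SD* values that fill these sentence frames can be computed with Python's standard `statistics` module. The sketch below uses a small hypothetical list of ages (not data from any of the studies quoted) and a hypothetical helper name, and assumes the sample standard deviation is the one being reported.

```python
import statistics


def describe(values: list[float]) -> str:
    """Format the mean and sample standard deviation as 'M = ..., SD = ...'."""
    m = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, with n - 1 in the denominator
    return f"M = {m:.1f}, SD = {sd:.2f}"


ages = [18, 22, 25, 31, 40]  # hypothetical participant ages
print(f"The age of participants ranged from {min(ages)} to {max(ages)} years "
      f"({describe(ages)}).")
```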

#### Test used

An independent-samples *t*-test was conducted to compare …

“An independent-samples *t*-test was conducted to compare salary in manual and non-manual conditions.”

An independent-samples *t*-test was run to determine …

“An independent-samples *t*-test was run to determine if there were differences in engagement to an advertisement between males and females.”

We have carried out an independent-samples t-test to compare …

“We have carried out an independent-samples t-test to compare the happiness scores for American men and women.”

A set of Pearson correlations were computed to determine …

“A set of Pearson correlations were computed to determine if there were any significant relationships between a number of employee variables.”

A paired samples t test (*N* = …) was conducted to evaluate …

“A paired samples t test (*N* = 40) was conducted to evaluate whether there was a significant difference between initial and current salaries.”

A chi-square test was performed …

“A chi-square test was performed to investigate the relationship between gender and salary.”

Data were analysed using a mixed-design ANOVA with a within-subjects factor of … and a between-subject factor of …

“Data were analysed using a mixed-design ANOVA with a within-subjects factor of type of work (manual, semi-skilled, skilled, professional) and a between-subject factor of sex (male, female).”

A chi-square test of independence was performed to …

“A chi-square test of independence was performed to examine the relation between ethnicity and subject interest.”

A chi-square test of goodness-of-fit was performed to determine whether …

“A chi-square test of goodness-of-fit was performed to determine whether the three types of car were equally preferred.”

We ran a chi-square test to …

“We ran a chi-squared test to examine whether gross national product (GNP) per capita of a country (GNPSPLIT) is related to its level of political freedom.”

Correlational analyses were used to …

“Correlational analyses were used to examine the relationship between the ages of younger and older participants’ first memories and their scores on three psychometric measures.”

#### Findings

A Mann-Whitney test indicated that …

“A Mann-Whitney test indicated that self-rated intelligence was greater for women who were not working (*Md* = 5) than for women who were working (*Md* = 4), *U* = 68.5, *p* = .035, *r* = .39.”

A chi-square test was performed and …

“A chi-square test was performed and no relationship was found between gender and the frequency of social talk, *χ*²(2, *N* = 170) = 1.10, *p* = .58.”

A paired-samples *t*-test indicated that …

“A paired-samples t-test indicated that scores were significantly higher for the salary scale (*M* = 26.4, *SD* = 7.41) than for the security scale (*M* = 18.0, *SD* = 9.49), *t*(721) = 23.3, *p* < .001, *d* = 0.87.”

An independent-samples *t*-test indicated that …

“An independent-samples *t*-test indicated that scores were significantly higher for women (*M* = 27.0, *SD* = 7.21) than for men (*M* = 24.2, *SD* = 7.69), *t*(734) = 4.30, *p* < .001, *d* = 0.35.”

An analysis of variance showed that …

“An analysis of variance showed that the effect of noise was significant, *F*(3,27) = 5.94, *p* = .007.”

…were positively correlated.

“Preferences for femininity in male and female faces were positively correlated, Pearson’s *r*(1282) = .13, *p* < .001.”

…were strongly positively correlated.

“Hours spent studying and GPA were strongly positively correlated, *r*(123) = .61, *p* = .011.”

… were moderately negatively correlated.

“Hours spent playing video games and GPA were moderately negatively correlated, *r*(123) = -.32, *p* = .041.”

We failed to find a significant correlation between …

“We failed to find a significant correlation between their participants’ personality scores at age 14 and their scores on the same items at the age of 77.”

… reported more … than ….

“Students taking statistics courses in business at the University of Hertfordshire reported studying more hours for tests (*M* = 121, *SD* = 14.2) than did UH students in general, *t*(33) = 2.10, *p* = .034.”

… a preference for … over ….

“Results indicate a significant preference for cod and chips (*M* = 3.45, *SD* = 1.11) over haddock and chips (*M* = 3.00, *SD* = .80), *t*(15) = 4.00, *p* = .001.”

#### Significance

##### Significant

For most research, a significance level of .05 is appropriate and is generally defined as being **statistically significant**. The .01 level is used in situations where you want to make a strong demonstration of treatment effect and is generally described as being **highly significant** (Gravetter & Wallnau, 1996, p. 243).
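The two conventional levels just described can be expressed as a simple labelling rule. This is a sketch only: the function name `significance_label` is hypothetical, and the cut-offs are the .05 and .01 levels given above rather than a universal standard.

```python
def significance_label(p: float) -> str:
    """Label a p value using the conventional .05 and .01 levels."""
    if p < 0.01:
        return "highly significant"
    if p < 0.05:
        return "statistically significant"
    return "not statistically significant"


for p in (0.008, 0.035, 0.12):
    print(p, significance_label(p))
```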

“This difference was significant, *t*(18) = -3.00, *p* < .05, two-tailed.”

“*S-N-K* post hoc tests showed that white people were significantly happier than members of the non-white races (black and other), *p* < .05, whereas the latter two groups did not differ from each other significantly.”

“Post-hoc comparisons using the **Tukey HSD** test indicated that the mean score for Group 1 (*M* = 21.36, *SD* = 4.55) was significantly different from Group 3 (*M* = 22.96, *SD* = 4.49). Group 2 (*M* = 22.10, *SD* = 4.15) did not differ significantly from either Group 1 or 3.”

“Results indicate a significant preference for cod and chips (*M* = 3.45, *SD* = 1.11) over haddock and chips (*M* = 3.00, *SD* = .80), *t*(15) = 4.00, *p* = .001.”

“All effects were statistically significant at the .05 significance level.”

“With an alpha level of .05, the effect of age was statistically significant, *F*(1, 123) = 7.27, *p* = .008.”

“The main effect of touch was non-significant, *F*(1, 108) = 2.24, *p* > .05. However, the interaction effect was significant, *F*(1, 108) = 5.55, *p* < .05.”

“There was a significant difference in the scores for degree (*M* = 4.2, *SD* = 1.3) and no degree (*M* = 2.2, *SD* = 0.84) conditions; *t*(8) = 2.89, *p* = 0.020.”

“We found a highly significant association between schizophrenia and a COMT haplotype (*p* = 9.5 × 10⁻⁸).”

##### Non-significant

“However, there is no statistically significant relationship between the number of television advertisements and the number of sales (*r* =.204, *p* = 0.131).”

“There was not a significant main effect of lighting, *F*(1, 25) = 0.50, *p* = .484, indicating that attractiveness ratings were similar overall in dim and bright conditions.”

“This is not significant and indicates a random relationship.”

“The interaction effect was non-significant, *F*(1, 24) = 1.22, *p* > .05.”

“The main effect of touch was non-significant, *F*(1, 108) = 2.24, *p* > .05. However, the interaction effect was significant, *F*(1, 108) = 5.55, *p* < .05.”

“*S-N-K* post hoc tests showed that white people were significantly happier than members of the non-white races (black and other), *p* < .05, whereas the latter two groups did not differ from each other significantly.”

“The effect of age was not statistically significant, *F*(1, 123) = 2.45, *p* = .12.”
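Most of the *p* values quoted above are written without a leading zero, and very small values are given as an inequality (*p* < .001). A minimal formatting sketch of that convention follows; the function name `format_p` and the choice of three decimal places are assumptions, since some of the quoted examples use two decimals or keep the leading zero.

```python
def format_p(p: float) -> str:
    """Format a p value without a leading zero, using 'p < .001' for small values."""
    if p < 0.001:
        return "p < .001"
    # Three decimals, then strip the zero before the decimal point.
    return "p = " + f"{p:.3f}".lstrip("0")


for p in (0.035, 0.008, 0.0004):
    print(format_p(p))
```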

### Conclusions

“These results suggest that salary really does have an effect on creativity at work. Specifically, our results suggest that when humans have a higher salary, they are more creative.”

“This suggests that smarter individuals have earlier first memories.”

“The study showed that white people were significantly happier than members of the non-white races (black and other).”

“There is no evidence to suggest that absence is more frequent at one age rather than another.”