In this practical, we will use several ‘real life’ data sets to demonstrate some of the concepts you have seen in the lectures. We will guide you through how to analyse these data sets and the kinds of questions you should be asking yourself when faced with similar data.

To explore the data sets and answer the questions in this practical
we will be using web applications we have developed in **R** using the **Shiny** framework.
**R** is a freely available statistical programming
language that is popular within academic communities and commercial
organizations. The functionality offered by R compares favourably with
other statistical packages but R has a steep learning curve and involves
writing code rather than use of a point-and-click user interface. We
have decided for this course to focus on the statistical concepts rather
than any particular statistical package. The online apps will allow you
to carry out the statistical analyses without needing to learn R.

We will be using the following Shiny web applications for the exercises in this practical.

The data sets we will be using are pre-loaded within the Shiny apps
but are also available for download using this **link**
in case you wish to do the exercises using your favourite statistics
package.

**Part
(i):**

The tab **Estimated coverage of Student’s CI** in the
**central-limit-theorem**
app displays the confidence intervals of 100 simulated datasets.

Assuming that the simulated data are normally distributed, what is the probability of the

**true**mean belonging to a confidence interval?Let X denote a random variable that equals 1 if the

**true mean belongs to the confidence interval**and 0 otherwise. What is the distribution of X?

**Part
(ii):**

Using the **central
limit theorem** app, answer the following questions:

Simulate

**1000 samples**of**size n = 10**of**Poisson**random variates, first assuming a**mean of 0.25**, and then assuming a**mean of 100**. Compare the coverage level of Student’s confidence intervals for the mean of these two simulation sets: How do you explain that the latter is better than the first one?Now consider

**zero-inflated Poisson**variates with a**mean of 30**and a**10% probability of belonging to the clump-at-zero**. Can you think of a random variable having such a distribution? How large should the sample size be for the Student’s confidence intervals to have good properties?A student lost a few points in the statistic exams as the use of Student’s confidence intervals for the probability of success of a

**Bernoulli**variable with**pi = 40%**and a**n=100**was not considered as suitable. Should they contact the University to dispute their mark?

Use the **statistical
tests app** to perform one-sample location tests.

For each exercise, select the corresponding numbered data set,
e.g. ‘1.1 Effect of disease X on height’, in the drop-down list on the
*Data input* tab page. The data will be shown in a table on the
same tab page for you to familiarize yourself with. Then select the
*One sample test* tab page and set the hypothesized mean.

You can visualize the data in the *Plots* tab, check the
assumptions for a parametric test on the *Assumptions* tab, and
carry out the test you decide is most appropriate on the *Statistical
tests* tab. The *Summary statistics* tab contains a number of
useful descriptive statistics about the data, including measures of the
central tendency (mean, median) and the variability in the data.

A scientist knows that the mean height of females in England is 165cm and wants to know whether her patients with a certain disease “X” have heights that differ significantly from the population mean - we will use a one-sample t-test to test this.

Click on the *Data input* tab from the navigation bar at the
top of the page. Then select the disease X data set from the drop-down
list. Look at the values for height for the patients in this data set in
the table shown below.

**Question:** What
are the null and alternative hypotheses?

Click on the *One sample test* tab in the navigation bar and
select the *Plots* tab to view a box plot and histogram of the
patient heights.

There are options for changing the display of these plots in the side panel on the left. For example, you can overlay points or a density plot on the box plot or use a different number of bins for the histogram and overlay a normal distribution.

**Questions:** Do the
data look normally distributed? Based on the plots, is the parametric,
one-sample t–test appropriate?

A *Q-Q plot* can also be helpful for assessing the normality
of the data and can be viewed on the *Assumptions* tab.

We are interested in knowing whether the mean height in our sample of
patients with disease X is different from that of the general
population. Perform a **one-sample t-test** on the
*Statistical test* tab.

**Questions:** What
is the mean height in your sample? What is your value of t? What is the
p-value? How do you interpret the p-value?

In blood plasma cancer, there is an increase in blood vessel
formation in the bone marrow. A stem cell transplant can be used as a
treatment for blood plasma cancer. The *Difference* column in
this data set reports the differences in bone marrow microvessel density
after treatment for 7 patients.

We are interested in seeing whether there is a decrease in the bone marrow microvessel density after treatment with a stem cell transplant.

Select the blood vessel formation data set (1.2) in the app on the
*Data input* page.

**Question:** What
are the null and alternative hypotheses?

View the histogram and box plot of the paired differences on the
*Plots* tab in the ’One sample test* page.

**Questions:** Do the
differences look normally distributed? Is the parametric t–test
appropriate?

**Question:** What is
the hypothesized mean for the null hypothesis in this case?

We are interested in seeing whether there is a decrease in the bone marrow microvessel density after treatment with a stem cell transplant.

**Question:** Is this
a one-tailed or two-tailed test?

Now select the appropriate options in the *Statistical test*
tab in order to perform the analysis.

**Questions:** What
is the mean difference? What is your value of t? What is the p-value?
How do you interpret the p-value?

Use the **statistical
tests app** to perform two-sample location tests.

For each exercise, select the corresponding numbered data set,
e.g. ‘2.1 Biological process duration’, in thedrop-down list on the
*Data input* tab page. The data will be shown in a table on the
same tab page. Then select the *Two sample test* tab page and
either select the categorical variable (the column that defines which
group each observation is in), the two groups to compare and the
numerical variable that is being compared for the two groups, or select
two numerical columns if the data contained paired observations for the
same subject.

You can visualize the data for the two groups in the *Plots*
tab, check the assumptions for a parametric test on the
*Assumptions* tab, and carry out the test you decide is most
appropriate on the *Statistical tests* tab.

This data set contains durations of a biological process for two
samples of wild type and knock-out cells. We are interested in seeing if
there is a difference in the durations for the two types of cells. We
shall use an **independent t-test** to compare the two cell
types.

Click on the *Data input* tab from the navigation bar at the
top of the page. Then select the biological process duration data set
from the drop-down list. Each row contains one observation. The cell
type is given in the *Group* column and the duration measurement
is given in the *Time* column.

**Question:** What
are the null and alternative hypotheses?

Click on the *Two sample test* tab in the navigation bar and
make sure the ‘Paired observations’ checkbox is not selected. The app
detects the group column and numerical variable automatically.

Select the *Plots* tab to view box plots and histograms of the
biological process durations of the two groups.

**Questions:** Do the
data look normally distributed for each cell type? Is the independent
t-test appropriate? What statistics are appropriate to report the
location (mean or median) and spread (standard deviation or
interquartile range, IQR) of the data?

In order to apply the correct statistical test, we need to test to
see if the variances of the two groups are comparable. We can run an
**F test** to compare the two variances under the
*Assumptions* tab. However, it is often easier to eyeball the box
plots of the data to decide if the variances are similar.

**Questions:** What
do you conclude from the p-value of the F test? Does this agree with
your impression of the variances from the box plot and histograms? How
does it influence what two sample test to use?

Now use the appropriate two-sample t-test to compare the durations of the two groups.

**Questions:** What
is your value of the test statistic? What is the p-value? How do you
interpret the p-value?

In blood plasma cancer, there is an increase in blood vessel formation in the bone marrow. A stem cell transplant can be used as a treatment for blood plasma cancer. The bone marrow microvessel density was measured before and after treatment for 7 patients with blood plasma cancer.

We are interested in seeing whether there is a decrease in the bone marrow microvessel density after treatment with a stem cell transplant. We will use a paired two-sample t-test to compare the before and after bone marrow microvessel densities.

Click on the *Data input* tab from the navigation bar at the
top of the page. Then select the blood vessel formation (2.2) data set
from the drop-down list. Each row contains before and after observations
for a single patient - these are paired observations.

**Question:** What
are the null and alternative hypotheses?

Click on the *Two sample test* tab in the navigation bar and
make sure the ‘Paired observations’ checkbox is selected. The app
detects the two numerical variables, Before and After,
automatically.

Select the *Plots* tab to view box plots and histograms of the
microvessel densities for the two groups and for the differences
following treatment.

Note that it is the differences that we’re interested in for these paired observation data so it is these that need to meet the normality assumption for a parametric t-test to be appropriate.

Check the test statistic and p-value of the test.

**Question:** How is
it that these match the ones from Exercise 1.2?

A gene expression study was performed on patients categorised into positive and negative estrogen receptor (ER) status groups. It is well known that ER positive patients have more treatment options available and better prognosis.

The gene NIBP was measured as part of this study. We are interested in seeing if the expression level of this gene is different between the ER positive and ER negative patients.

**Question:** What
are the null and alternative hypotheses?

Now conduct an independent two-sample t-test to see if there is a difference in expression between the two groups.

**Question:** What is
the p-value from the test? Do we achieve statistical significance at the
0.05 level?

Look closely at data distribution, calculated means for each group and the estimated confidence interval.

**Questions:** Is the
finding likely to hold biological significance? Would you be willing to
put further resources into validating the finding?

This data set contains data on vitamin D levels for subjects with (“Y”) and without (“N”) fibrosis. We are interested in seeing if there is a difference in these levels between the two groups.

**Question:** State
the null and alternative hypotheses.

Examine the distribution of the data.

**Question:** Why
doesn’t a parametric analysis seem appropriate?

By selecting a **non-parametric test** in the
*Statistical tests* tab you will see the results of a
**Wilcoxon rank-sum test**, also known as the
**Mann-Whitney U test**.

**Question:** How do
you interpret the result of the test?

Dr D. R. Peterson of the Department of Epidemiology, University of Washington, collected the data on the birth weights of each of 20 dizygous twins. One twin suffered Sudden Infant Death Syndrom (SIDS) and the other twin did not. The hypothesis to be tested is that the SIDS child of each pair had a lower birth weight.

**Question:** State
the null and alternative hypothesises.

**Questions:** Should
the test be one-sided or two-sided? Would it be appropriate to treat
these as paired or independant samples?

Decide whether to run a parametric or non-parametric test.

**Question:** How do
you interpret the result of the test?

In this section, we invite you to form small groups. Each group will be assigned one of the exercises.

At the end of the time assigned for the exercise we will go through each of the problems in turn and invite a representative of each group to present the problem to the rest of the class along with the analysis (descriptive analysis, statistical tests) the group felt was most appropriate and any conclusions made.

If time allows, groups may wish to familiarize themselves with some of the other exercises so that they can contribute to the presentations made by other groups.

Darwin (1876) studied the growth of *pairs* of zea may (aka
corn) seedlings, one produced by cross-fertilization and the other
produced by self-fertilization, but otherwise grown under identical
conditions. His goal was to demonstrate the greater vigour of the
cross-fertilized plants. The data recorded are the final height (inches,
to the nearest 1/8th) of the plants in each pair.

*Is there evidence to support
the hypothesis of greater growth in cross-fertilized
plants?*

In the history of data visualization, Florence Nightingale is best remembered for her role as a social activist and her view that statistical data, presented in charts and diagrams, could be used as powerful arguments for medical reform.

After witnessing deplorable sanitary conditions in the Crimea, she wrote several influential texts (Nightingale, 1858, 1859), including polar-area graphs (sometimes called “Coxcombs” or rose diagrams), showing the number of deaths in the Crimean from battle compared to disease or preventable causes that could be reduced by better battlefield nursing care.

Her Diagram of the Causes of Mortality in the Army in the East showed that most of the British soldiers who died during the Crimean War died of sickness rather than of wounds or other causes. It also showed that the death rate was higher in the first year of the war, before a Sanitary Commissioners arrived in March 1855 to improve hygiene in the camps and hospitals.

*Do the data support the
claim that deaths due to avoidable causes decreased after a change in
regime?*

The addition of bran to the diet has been reported to benefit patients with diverticulosis. Several different bran preparations are available, and a clinician wants to test the efficacy of two of them on patients, since favourable claims have been made for each. Among the consequences of administering bran that requires testing is the transit time through the alimentary canal. By random allocation the clinician selects two groups of patients aged 40-64 with diverticulosis of comparable severity. Sample 1 contains 15 patients who are given treatment A, and sample 2 contains 12 patients who are given treatment B.

*Does transit time differ in
the two groups of patients taking these two
preparations?*

Consider a clinical investigation to assess the effectiveness of a new drug designed to reduce repetitive behaviors in children affected with autism. If the drug is effective, children will exhibit fewer repetitive behaviors on treatment as compared to when they are untreated. A total of 8 children with autism enroll in the study. Each child is observed by the study psychologist for a period of 3 hours both before treatment and then again after taking the new drug for 1 week. The time that each child is engaged in repetitive behavior during each 3 hour observation period is measured. Repetitive behavior is scored on a scale of 0 to 100 and scores represent the percent of the observation time in which the child is engaged in repetitive behavior. For example, a score of 0 indicates that during the entire observation period the child did not engage in repetitive behavior while a score of 100 indicates that the child was constantly engaged in repetitive behavior.

*Is there statistically
significant improvement in repetitive behavior after 1 week of
treatment?*

CD4 cells are carried in the blood as part of the human immune system. One of the effects of the HIV virus is that these cells die. The count of CD4 cells is used in determining the onset of full-blown AIDS in a patient. In this study of the effectiveness of a new anti-viral drug on HIV, 20 HIV-positive patients had their CD4 counts recorded and then were put on a course of treatment with this drug. After using the drug for one year, their CD4 counts were again recorded.

*Do patients taking the drug
have increased CD4 counts?*

Drunk driving is one of the main causes of car accidents. Interviews with drunk drivers who were involved in accidents and survived revealed that one of the main problems is that drivers do not realize that they are impaired, thinking “I only had 1-2 drinks… I am OK to drive.”

A sample of 100 drivers was chosen, and their reaction times in an
obstacle course were measured *before* and *after*
drinking two beers. The purpose of this study was to check whether
drivers are impaired after drinking two beers.

*Does drinking beer alter the
reaction time of the driver?*

Laureysens et al. (2004) measured metal content in the wood of 13 poplar clones growing in a polluted area, once in August and once in November. Concentrations of aluminum (in micrograms of Al per gram of wood) are shown below.

*Is there any evidence for an
increase in pollution between November and August?*

The 2008-09 nine-month academic salaries for Assistant Professors, Associate Professors and Professors in a college in the U.S. were collected as part of the on-going effort of the college’s administration to monitor salary differences between male and female faculty members.

*Is there evidence that
female professors are paid differently to their male
counterparts?*