Section 1: Practicals

(i) Fish, tanks and food

An experimenter investigating the effect of different foods on a species of fish places the food in tanks containing the fish. The weight increase of the fish is the response \(y_{ijk}\), where i, j and k are indices identifying the type of food, the tank and the fish, respectively.

  • Please identify the fixed and random effects.
  • Please identify the predictable and unpredictable (i.e. error) random effects.
  • Please write the equation of a reasonable regression model generating the response.
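
One way to make the structure of such data concrete is to simulate it. The sketch below is only an illustration under stated assumptions: 3 food types, 4 tanks per food type and 5 fish per tank, with made-up effect sizes and standard deviations (none of these numbers come from Casella, 2008). It reflects one possible reading of the design, in which food is a fixed effect, tank is a random effect nested within food, and the fish-level term is the error.

```r
set.seed(1)                      # reproducible illustration

n_food <- 3                      # assumed number of food types (index i)
n_tank <- 4                      # assumed number of tanks per food type (index j)
n_fish <- 5                      # assumed number of fish per tank (index k)

# One row per fish; each tank receives exactly one food (tank nested in food)
fish <- expand.grid(k = 1:n_fish, j = 1:n_tank, i = 1:n_food)
fish$tank <- interaction(fish$i, fish$j, drop = TRUE)   # unique tank label

mu       <- 10                                  # hypothetical overall mean
food_eff <- c(0, 1.5, 3)                        # hypothetical fixed food effects
tank_eff <- rnorm(nlevels(fish$tank), sd = 1)   # random tank effects
names(tank_eff) <- levels(fish$tank)

# y_ijk = mu + food_i + tank_j(i) + error_ijk
fish$y <- mu + food_eff[fish$i] +
  tank_eff[as.character(fish$tank)] +
  rnorm(nrow(fish), sd = 0.5)

# Each tank (column) appears under a single food (row): the signature of nesting
table(fish$i, fish$tank)
```

In the cross-tabulation, every tank column has exactly one non-zero cell, which is what distinguishes a nested factor from a crossed one.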

Note: this example has been taken from the following reference: Casella, George (2008). Statistical design. Berlin: Springer. ISBN 978-0-387-75965-4.

(ii) Anticancer activity of carboplatin combined with nivolumab

In a preclinical in vivo experiment, the efficacy of carboplatin combined with nivolumab was tested. A mouse model of non-small-cell lung cancer (NSCLC) was used. The statistical design is reported in the following flow chart and has the following characteristics:


1. The experimental groups were:
         n.1: Vehicle
         n.2: Carboplatin
         n.3: Nivolumab
         n.4: Carboplatin plus nivolumab
2. A blocked randomisation by gender and baseline tumour volume was used to ensure balanced groups and a high probability of detecting an antineoplastic synergy between the chemotherapeutic (i.e. carboplatin) and immunotherapeutic (i.e. nivolumab) compounds. Two batches of animals were used. The first batch was randomised in April, the second batch in May.
3. The primary response, tumour volume (mm³), was assessed at baseline and daily over the subsequent two weeks. At baseline and in the first week it was assessed by the operator Mark, in the second week by the operator Peter.

  • Please identify the fixed effects.
  • Please identify the random effects.
  • Please identify the crossed and nested effects.
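
Whether two factors are crossed or nested can be checked by cross-tabulating them. The sketch below builds a hypothetical layout of the experiment and tests a few pairs of factors. The number of animals per group and batch (5) is an assumption, since it is not stated above, and the helper `is_nested()` is introduced here purely for illustration. Day 0 stands for baseline, days 1 to 7 for the first week and days 8 to 14 for the second week.

```r
# Hypothetical layout: 5 animals per group per batch (not stated in the text)
animals <- expand.grid(
  rep   = 1:5,
  group = c("Vehicle", "Carboplatin", "Nivolumab", "Carboplatin plus nivolumab"),
  batch = c("April", "May")
)
animals$animal <- seq_len(nrow(animals))        # unique animal identifier

# Daily measurements; merge() without common columns gives a cross join,
# so every animal is paired with every day
obs <- merge(animals, data.frame(day = 0:14))
obs$operator <- ifelse(obs$day <= 7, "Mark", "Peter")

# TRUE if every level of `inner` occurs with exactly one level of `outer`
is_nested <- function(inner, outer) {
  all(rowSums(table(inner, outer) > 0) == 1)
}

is_nested(obs$animal, obs$group)     # is animal nested within group?
is_nested(obs$animal, obs$batch)     # is animal nested within batch?
is_nested(obs$group,  obs$batch)     # is group nested within batch?
is_nested(obs$day,    obs$operator)  # is day nested within operator?
```

A result of `FALSE` means that at least one level of the first factor occurs with several levels of the second, which is what happens when two factors are crossed.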

(iii) Systematic component of regression models

The relationship between the response FEV (lung capacity) and the predictors age (\(x_1\)), height (\(x_2\)), gender (\(x_3\)), smoking status (\(x_4\)) and place of residence (\(x_5\)) could be described by the following systematic components:

\(\mu = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5\)
\(1/\mu = 0 + e^{\beta_1 x_1} + \beta_2 x_2^{1/3} + 0 + \beta_4 x_4 + 0\)
\(\log_e \mu = \beta_0 + 0 + \beta_2 x_2 + \beta_3 x_3^2 + \beta_4 x_4 + \beta_5 x_5\)
\(\mu = \beta_0 + \beta_1 x_1^{1/2} + 0 + 0 + 0 + \beta_5 x_5\)

  • For each systematic component, please answer the following:
    1. How many predictors does it contain?
    2. How many regression parameters does it contain?
    3. Is the mean FEV of the UK population analysed on the natural scale?
    4. Is the systematic component linear in the parameters?
    5. Could the systematic component be used in a linear regression model?
  • Can you give a meaning for the parameter \(\beta_0\)?
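
The idea of linearity in the parameters can be illustrated with a small simulation. The sketch below uses simulated data only (no real FEV data) and made-up "true" parameter values. It generates a response whose mean has the form \(\mu = \beta_0 + \beta_1 x_1^{1/2} + \beta_5 x_5\) and fits it with `lm()`. The predictor enters through a non-linear transformation, but each parameter still multiplies a known column, so ordinary least squares applies.

```r
set.seed(42)                     # reproducible illustration
n  <- 500
x1 <- runif(n, 5, 20)            # simulated ages in years (illustration only)
x5 <- rbinom(n, 1, 0.5)          # simulated 0/1 indicator for place of residence

beta <- c(b0 = 0.5, b1 = 0.8, b5 = -0.3)   # made-up "true" parameters

# Response generated from mu = b0 + b1 * sqrt(x1) + b5 * x5 plus normal noise
fev <- beta[["b0"]] + beta[["b1"]] * sqrt(x1) + beta[["b5"]] * x5 +
  rnorm(n, sd = 0.2)

# sqrt(x1) is just a transformed column: the model is still linear in the betas
fit <- lm(fev ~ sqrt(x1) + x5)
round(coef(fit), 2)              # estimates should be close to the values in `beta`
```

A component in which a parameter sits inside a non-linear function, for example in an exponent, cannot be rewritten in this way.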

(iv) Noisy miners and number of eucalypt trees

The data for this exercise are available in R as the data frame nminer, part of the GLMsData package [1].
Let's start by

  • installing the GLMsData package in R
  • loading the GLMsData package and the nminer data frame
  • displaying the first lines of data as follows:
# install.packages("GLMsData") # Install the GLMsData package (run once)
library(GLMsData)              # Load the GLMsData package
data(nminer)                   # Make the data set nminer available for use
head(nminer)                   # Display the first few lines of data
##   Miners Eucs Area Grazed Shrubs Bulokes Timber Minerab
## 1      0    2   22      0      1     120     16       0
## 2      0   10   11      0      1      67     25       0
## 3      1   16   51      0      1      85     13       3
## 4      1   20   22      0      1      45     12       2
## 5      1   19    4      0      1     160     14       8
## 6      1   18   61      0      1      75      6       1

The noisy miner is a small but aggressive native Australian bird. A study [2] of the habitats of the noisy miner recorded the number of noisy miners (that is, the number observed; column Minerab) in two-hectare transects located in buloke woodland patches with varying numbers of eucalypt trees (column Eucs).

  • Please plot the number of noisy miners against the number of eucalypt trees by means of a scatter plot.
plot( jitter(Minerab) ~ Eucs, data=nminer, las=1, ylim=c(0, 20),
      xlab="Number of eucalypts per 2 ha", ylab="Number of noisy miners" )

  • Consider the number of noisy miners as the response and the number of eucalypt trees as the predictor.
    1. Is the relationship between the mean number of noisy miners (\(\mu\)) and the number of eucalypt trees linear?
    2. Does the random component have a constant variance?
    3. Could the normal distribution reasonably describe the random component?
    4. Is it possible to describe the distribution of the number of noisy miners as a function of the number of eucalypt trees using a linear regression model?
    5. Please propose a reasonable regression model generating these data.
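
Questions 1 to 3 can also be explored numerically by grouping the observations by ranges of the predictor and comparing the mean and the variance of the response within each group. The sketch below is a minimal illustration: the helper `mean_var_by_group()` and the break points are assumptions introduced here, and the data are simulated stand-ins (Poisson counts with a made-up log-linear mean), not the nminer data.

```r
# Group a response by intervals of a predictor and compare mean with variance
mean_var_by_group <- function(y, x, breaks) {
  g <- cut(x, breaks = breaks, include.lowest = TRUE)
  data.frame(group    = levels(g),
             n        = as.vector(table(g)),
             mean     = as.vector(tapply(y, g, mean)),
             variance = as.vector(tapply(y, g, var)))
}

# Simulated stand-in data (NOT the nminer data), with made-up parameters
set.seed(7)
eucs   <- sample(0:30, 200, replace = TRUE)
miners <- rpois(200, lambda = exp(-1 + 0.1 * eucs))

res <- mean_var_by_group(miners, eucs, breaks = c(0, 10, 20, 30))
res

# With the real data, after library(GLMsData) and data(nminer):
# mean_var_by_group(nminer$Minerab, nminer$Eucs,
#                   breaks = c(0, 10, 20, max(nminer$Eucs)))
```

If the variance changes with the mean across groups, the constant-variance assumption of a linear regression model is in doubt.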

Note: this example has been taken from the following reference: Dunn, Peter K. and Smyth, Gordon K. (2018). Generalized Linear Models With Examples in R. Springer.

References:
[1] Dunn, P.K., Smyth, G.K.: GLMsData: Generalized linear model data sets (2017). URL https://CRAN.R-project.org/package=GLMsData
[2] Maron, M.: Threshold effect of eucalypt density on an aggressive avian competitor. Biological Conservation 136, 100–107 (2007).