The exercise uses a more realistic dataset, building on the patients data frame we’ve already been working with.
The patients are all part of a diabetes study and have had their blood glucose concentration and diastolic blood pressure measured on several dates.
This part of the exercise combines grouping, summarisation and joining operations to connect the diabetes study measurements to the patients table.
Read the data from the file diabetes.txt into a new object.
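As a minimal sketch of the reading step, the snippet below assumes that diabetes.txt is tab-delimited with a header row; the exercise text does not state the delimiter, so `read_tsv()` is an assumption. To keep the sketch runnable on its own, it first writes a small stand-in file containing the first three rows of the output shown below. The names `sample_file` and `diabetes` are illustrative.

```r
library(readr)

# Assumption: the file is tab-delimited with a header row. For a comma- or
# whitespace-delimited file, read_csv() or read_table() would be used instead.

# Write a tiny stand-in file (first three rows of the printed output) so the
# sketch runs without the real diabetes.txt.
sample_file <- tempfile(fileext = ".txt")
write_lines(
  c("ID\tDate\tGlucose\tBP",
    "AC/AH/001\t2011-03-07\t100\t98",
    "AC/AH/001\t2011-03-14\t110\t89",
    "AC/AH/001\t2011-03-24\t94\t88"),
  sample_file
)

# In the exercise itself the path would simply be "diabetes.txt".
diabetes <- read_tsv(sample_file, show_col_types = FALSE)
diabetes
```

With readr's default column guessing, ISO-formatted dates such as 2011-03-07 should be parsed as dates, which matches the `<date>` column type in the printed tibble.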
## # A tibble: 1,316 x 4
##    ID        Date       Glucose    BP
##    <chr>     <date>       <dbl> <dbl>
##  1 AC/AH/001 2011-03-07     100    98
##  2 AC/AH/001 2011-03-14     110    89
##  3 AC/AH/001 2011-03-24      94    88
##  4 AC/AH/001 2011-03-31     111    92
##  5 AC/AH/001 2011-04-03      94    83
##  6 AC/AH/001 2011-05-21     110    93
##  7 AC/AH/001 2011-06-24     105    79
##  8 AC/AH/001 2011-07-11      88    86
##  9 AC/AH/001 2011-07-11     101    92
## 10 AC/AH/001 2011-07-13     112    88
## # … with 1,306 more rows
The goal is to compare the blood pressure of smokers and non-smokers.
First, calculate the average blood pressure for each individual in the diabetes data frame.
Now use one of the join functions to combine these average blood pressure measurements with the patients data frame containing information on whether the patient is a smoker.
Finally, calculate the average blood pressure for smokers and non-smokers on the resulting, combined data frame.
## # A tibble: 2 x 2
##   Smokes     MeanBP
##   <chr>       <dbl>
## 1 Non-Smoker   82.0
## 2 Smoker       84.6
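The three steps above might be sketched as follows. The data here are small made-up stand-ins, not the study data, and the sketch assumes that the patients data frame has an `ID` column matching the one in the diabetes data and a `Smokes` column. The choice of `inner_join()` is also an assumption; `left_join()` would behave the same way when every patient appears in both tables.

```r
library(dplyr)
library(tibble)

# Made-up stand-in data for illustration only.
diabetes <- tribble(
  ~ID,  ~BP,
  "P1", 80,
  "P1", 90,
  "P2", 70,
  "P2", 74,
  "P3", 88
)
patients <- tribble(
  ~ID,  ~Smokes,
  "P1", "Smoker",
  "P2", "Non-Smoker",
  "P3", "Smoker"
)

# Step 1: average blood pressure for each individual.
mean_bp <- diabetes %>%
  group_by(ID) %>%
  summarise(MeanBP = mean(BP))

# Step 2: attach the smoking status from the patients table.
# Assumption: both tables share an ID column.
combined <- inner_join(mean_bp, patients, by = "ID")

# Step 3: average the per-patient means within each smoking group.
bp_by_smoking <- combined %>%
  group_by(Smokes) %>%
  summarise(MeanBP = mean(MeanBP))

bp_by_smoking
```

Note that the final step averages the per-patient means, so each patient counts equally regardless of how many measurements they have.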
Can you write this whole operation as a single dplyr chain?
## # A tibble: 2 x 2
##   Smokes     MeanBP
##   <chr>       <dbl>
## 1 Non-Smoker   82.0
## 2 Smoker       84.6
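One possible shape for the single chain is sketched below, again on made-up stand-in data and under the assumption that both tables share an `ID` column and that patients has a `Smokes` column. Because the join functions take the left-hand table as their first argument, the join can sit in the middle of the pipe.

```r
library(dplyr)
library(tibble)

# Made-up stand-in data for illustration only.
diabetes <- tibble(
  ID = c("P1", "P1", "P2", "P2", "P3"),
  BP = c(80, 90, 70, 74, 88)
)
patients <- tibble(
  ID     = c("P1", "P2", "P3"),
  Smokes = c("Smoker", "Non-Smoker", "Smoker")
)

bp_by_smoking <- diabetes %>%
  group_by(ID) %>%
  summarise(MeanBP = mean(BP)) %>%   # per-patient mean
  inner_join(patients, by = "ID") %>% # add smoking status (ID column assumed)
  group_by(Smokes) %>%
  summarise(MeanBP = mean(MeanBP))    # mean of per-patient means per group

bp_by_smoking
```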
In these exercises we look at adjusting the scales.
Using the patients dataset from earlier, generate a scatter plot of BMI versus Weight.
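A basic version of the scatter plot could look like the sketch below, which is the starting point before any scale adjustments. It uses a few made-up values in place of the real patients data and assumes the columns are named `Weight` and `BMI`; "BMI versus Weight" is read here as BMI on the y axis and Weight on the x axis.

```r
library(ggplot2)
library(tibble)

# Made-up stand-in values for illustration only.
patients <- tibble(
  Weight = c(60, 72, 85, 95),
  BMI    = c(21.5, 24.0, 27.3, 30.1)
)

# Scatter plot with Weight on the x axis and BMI on the y axis.
bmi_plot <- ggplot(patients, aes(x = Weight, y = BMI)) +
  geom_point()

bmi_plot
```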