---
title: "Exercise 8.5: Blocking on the rat diet dataset"
author: "Lieven Clement and Jeroen Gilis"
date: "statOmics, Ghent University (https://statomics.github.io)"
output:
  html_document:
    code_download: true
    theme: cosmo
    toc: true
    toc_float: true
    highlight: tango
    number_sections: true
---
# Background
Researchers are studying the impact of protein sources and protein levels in
the diet on the weight of rats. They feed the rats with diets of beef, cereal
and pork and use a low and high protein level for each diet type.
The researchers can include 60 rats in the experiment. Prior to the experiment,
the rats were divided into 10 homogeneous groups of 6 rats based on
characteristics such as initial weight, appetite, etc.
Within each group, each rat is randomly assigned to one of the diets. The rats are fed for
a month and the weight gain in grams is recorded for each rat.
The researchers want to assess the effect of the type of diet and the protein
level on the weight of the rats.
In this exercise we will perform the data exploration using all diets but,
to keep the data analysis simple, we will only assess the beef and cereal diets.
# Experimental design
- There are three explanatory variables in the experiment: the factor diet type
with three levels (beef, cereal and pork), the factor protein level with two levels
(low and high) and a group blocking factor with 10 levels.
- There are 6 treatments: beef-high, cereal-high, pork-high, beef-low,
cereal-low, pork-low protein.
- The rats are the experimental units (the unit to which a treatment is applied). This design has a randomisation restriction: within a block, each rat is randomly assigned to a diet.
- The rats are the observational units (the unit on which the response is measured): the weight gain is measured for each rat.
- The weight gain is the response variable.
- The experiment is a randomized complete block (RCB) design.
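
The layout described above can be sketched in base R. This is a minimal illustration that rebuilds the design from scratch rather than from the real data. The column names `block`, `protSource` and `protLevel` follow the tidy-data chunk of this exercise, while the level labels (`beef`, `cereal`, `pork`, `l`, `h`) and the `rat` and `treatment` columns are assumptions made for the example.

```r
# Hypothetical reconstruction of the RCB layout: 10 blocks x 6 treatments.
design <- expand.grid(
  block      = factor(1:10),
  protSource = c("beef", "cereal", "pork"),  # assumed level labels
  protLevel  = c("l", "h")                   # assumed level labels
)
design$treatment <- interaction(design$protSource, design$protLevel, sep = "-")

# Restricted randomisation: within each block, the 6 treatments are assigned
# to the 6 rats (numbered 1 to 6) in random order.
set.seed(123)
design$rat <- ave(seq_len(nrow(design)), design$block,
                  FUN = function(i) sample(length(i)))

# Every treatment should occur exactly once in every block.
layout <- table(design$block, design$treatment)
nrow(design)      # 60 rats in total
all(layout == 1)  # TRUE for a complete block design
```

Running the same `table()` call on the imported data is a quick way to confirm that the real experiment is balanced in the same way.
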
Load libraries
```{r, message=FALSE, warning=FALSE}
library(tidyverse)
```
# Data import
```{r}
diet <- read.table("https://raw.githubusercontent.com/statOmics/PSLS21/data/dietRats.txt",
                   header = TRUE)
head(diet)
```
# Tidy data
```{r}
diet <- diet %>%
  mutate(block = as.factor(block),
         protSource = as.factor(protSource),
         protLevel = as.factor(protLevel)) %>%
  mutate(protLevel = fct_relevel(protLevel, "l"))
```
# Data exploration
- Boxplot of the weight gain against protein source and protein level, with points coloured according to block
```{r}
```
- Line plot of the weight gain against protein source and protein level, with colouring and grouping according to block
```{r}
```
- Interpret the plots
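
As a starting point for the two plots, here is a minimal sketch on synthetic stand-in data, so the exploration of the real data is still left as the exercise. The response column name `weightGain`, the level labels and the simulated effect sizes are all assumptions; check `head(diet)` for the actual names.

```r
library(ggplot2)

# Synthetic stand-in data (NOT the real measurements); `weightGain` is an
# assumed name for the response column.
set.seed(1)
toy <- expand.grid(
  block      = factor(1:10),
  protSource = c("beef", "cereal", "pork"),
  protLevel  = factor(c("l", "h"), levels = c("l", "h"))
)
block_effect <- rnorm(10, sd = 5)
toy$weightGain <- 80 + block_effect[as.integer(toy$block)] +
  10 * (toy$protLevel == "h") + rnorm(nrow(toy), sd = 3)

# Boxplot per protein source, one panel per protein level,
# with the individual rats overlaid and coloured by block.
p_box <- ggplot(toy, aes(x = protSource, y = weightGain)) +
  geom_boxplot(outlier.shape = NA) +
  geom_point(aes(colour = block), position = position_jitter(width = 0.1)) +
  facet_wrap(~ protLevel)

# Line plot: one line per block across the six treatments.
p_line <- ggplot(toy, aes(x = interaction(protSource, protLevel),
                          y = weightGain, colour = block, group = block)) +
  geom_line() +
  geom_point()

p_box
p_line
```

In the line plot, roughly parallel lines would suggest that blocks shift the weight gain up or down without changing the treatment effects.
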
# Data filtering
Filter the data to only use the beef and cereal diet.
```{r}
```
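
One detail that is easy to miss when filtering on a factor is that the removed level stays in the factor's level set unless it is dropped explicitly. The sketch below shows this on a hypothetical design table; the level labels are assumptions.

```r
library(dplyr)

# Hypothetical design table with assumed level labels.
toy <- expand.grid(
  block      = factor(1:10),
  protSource = c("beef", "cereal", "pork"),
  protLevel  = c("l", "h")
)

# Keep the beef and cereal rows only.
toy_filtered <- toy %>%
  filter(protSource %in% c("beef", "cereal"))
levels(toy_filtered$protSource)  # "pork" is still listed as a level

# droplevels() removes factor levels that no longer occur in the data.
toy_bc <- droplevels(toy_filtered)
levels(toy_bc$protSource)
nrow(toy_bc)
```

An unused level would otherwise show up as an empty category in tables and plots.
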
# Multiple linear regression analysis
## Model specification
Based on the data exploration, propose a sensible regression model to analyse
the data.
## Assumptions
Check the assumptions of the linear regression model.
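
A minimal sketch of one candidate model and its residual diagnostics, fitted to synthetic stand-in data. The model formula is only one possible choice, not necessarily the intended answer, and the response name `weightGain`, the level labels and the simulated effects are assumptions.

```r
# Synthetic stand-in data (NOT the real measurements).
set.seed(1)
toy <- expand.grid(
  block      = factor(1:10),
  protSource = c("beef", "cereal"),
  protLevel  = factor(c("l", "h"), levels = c("l", "h"))
)
block_effect <- rnorm(10, sd = 5)
toy$weightGain <- 80 + block_effect[as.integer(toy$block)] +
  10 * (toy$protLevel == "h") - 5 * (toy$protSource == "cereal") +
  rnorm(nrow(toy), sd = 3)

# Block as an additive effect, plus main effects and interaction of
# protein source and protein level.
fit <- lm(weightGain ~ block + protSource * protLevel, data = toy)

# Residuals vs fitted (linearity, equal variance) and normal QQ-plot.
par(mfrow = c(1, 2))
plot(fit, which = 1:2)

length(coef(fit))  # 13 parameters: intercept, 9 block, 2 main effects, 1 interaction
df.residual(fit)   # 27 residual degrees of freedom (40 observations - 13)
```

Independence cannot be read from these plots; it has to follow from the design, which is why the block effect is included in the model.
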
## Hypothesis testing
Use the `summary` function to get an initial test for the parameters in the
model.
## Interpretation of the regression parameters
To facilitate the interpretation of the different parameters in our regression
model, we can make use of the `VisualizeDesign` function of the
`ExploreModelMatrix` R package. The first argument of this function is the name
of our target dataset, the second argument is a model formula, which in this
case is specified as a `~` followed by the explanatory variables in our model.
```{r, eval = FALSE}
library(ExploreModelMatrix)
ExploreModelMatrix::VisualizeDesign(..., ~ ...)$plotlist
```
After seeing this, again think about the meaning of the parameters in our model.
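
The same information can also be read from the design matrix with base R's `model.matrix`. The sketch below uses a hypothetical two-block design and assumed level labels; each row shows which parameters add up to the mean of one treatment within block 1.

```r
# Hypothetical mini design: 2 blocks, 2 protein sources, 2 protein levels.
design <- expand.grid(
  block      = factor(1:2),
  protSource = c("beef", "cereal"),
  protLevel  = factor(c("l", "h"), levels = c("l", "h"))
)

# Design matrix of a model with a block effect and a source-by-level interaction.
X <- model.matrix(~ block + protSource * protLevel, data = design)

# Keep the four treatment rows of block 1 and label them by treatment.
in_block1 <- design$block == "1"
cells <- X[in_block1, ]
rownames(cells) <- paste(design$protSource[in_block1],
                         design$protLevel[in_block1], sep = "-")
cells
```

A 1 in a column means that parameter contributes to the mean of that treatment. For example, the `cereal-h` row has a 1 for the intercept, both main effects and the interaction.
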
## Testing the overall (omnibus) effect of diet
Test the overall effect of diet by comparing a model that contains diet effects
with a model that does not, using `anova`.
```{r}
```
## Assessing the interaction effect between protein source and protein level
```{r}
```
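
Both the omnibus test of the previous section and the interaction test can be phrased as comparisons of nested models. The sketch below does this on synthetic stand-in data; the response name `weightGain`, the level labels and the model formulas are assumptions, not the required solution.

```r
# Synthetic stand-in data (NOT the real measurements).
set.seed(1)
toy <- expand.grid(
  block      = factor(1:10),
  protSource = c("beef", "cereal"),
  protLevel  = factor(c("l", "h"), levels = c("l", "h"))
)
block_effect <- rnorm(10, sd = 5)
toy$weightGain <- 80 + block_effect[as.integer(toy$block)] +
  10 * (toy$protLevel == "h") - 5 * (toy$protSource == "cereal") +
  rnorm(nrow(toy), sd = 3)

fit_full  <- lm(weightGain ~ block + protSource * protLevel, data = toy)
fit_add   <- lm(weightGain ~ block + protSource + protLevel, data = toy)
fit_block <- lm(weightGain ~ block, data = toy)

# Omnibus test: do the diet terms (two main effects and the interaction)
# jointly explain variation beyond the blocks? F-test on 3 degrees of freedom.
omnibus <- anova(fit_block, fit_full)
omnibus

# Interaction test: does the full model fit better than the additive one?
# F-test on 1 degree of freedom.
interaction_test <- anova(fit_add, fit_full)
interaction_test
```
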
## Assessing specific contrasts
Imagine that we are interested in assessing:
1. the effect of protein source in the low protein diets,
2. the effect of protein source in the high protein diets,
3. the effect of protein level for beef diets,
4. the effect of protein level for cereal diets,
5. whether the effect of the protein level differs between beef and cereal diets.
Step 1: translate these research questions into parameters of the model (or
combinations of multiple parameters).
Step 2: Assess the significance of the contrasts using the `multcomp` package.
The contrasts are given as input in the form of symbolic
descriptions to the `linfct` argument of the `glht` function.
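
A minimal sketch of the two steps on synthetic stand-in data. The parameter names in the contrasts (`protSourcecereal`, `protLevelh`) follow from the assumed level labels and from treatment coding with beef and low protein as reference levels. The response name `weightGain` is also an assumption, so the names should be checked against `coef()` of the model fitted to the real data.

```r
library(multcomp)

# Synthetic stand-in data (NOT the real measurements).
set.seed(1)
toy <- expand.grid(
  block      = factor(1:10),
  protSource = c("beef", "cereal"),
  protLevel  = factor(c("l", "h"), levels = c("l", "h"))
)
block_effect <- rnorm(10, sd = 5)
toy$weightGain <- 80 + block_effect[as.integer(toy$block)] +
  10 * (toy$protLevel == "h") - 5 * (toy$protSource == "cereal") +
  rnorm(nrow(toy), sd = 3)

fit <- lm(weightGain ~ block + protSource * protLevel, data = toy)

# One symbolic description per research question, in the order listed above.
rat_contrasts <- glht(fit, linfct = c(
  "protSourcecereal = 0",                               # 1. source, low protein
  "protSourcecereal + protSourcecereal:protLevelh = 0", # 2. source, high protein
  "protLevelh = 0",                                     # 3. level, beef
  "protLevelh + protSourcecereal:protLevelh = 0",       # 4. level, cereal
  "protSourcecereal:protLevelh = 0"                     # 5. interaction
))

# By default, summary() reports p-values adjusted for testing all five
# contrasts simultaneously.
summary(rat_contrasts)
```

Each contrast estimate is the corresponding sum of model parameters, which can be verified against `coef(fit)`.
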
# Conclusion
Formulate a conclusion for the different research hypotheses.