---
title: "Exercise 7.3: ANOVA on the NHANES dataset"
author: "Lieven Clement and Jeroen Gilis"
date: "statOmics, Ghent University (https://statomics.github.io)"
---
# The NHANES dataset

The National Health and Nutrition Examination Survey (NHANES) contains data
that have been collected since 1960. For this tutorial, we will use
the data that were collected between 2009 and 2012 for 10,000 U.S. civilians.
The dataset contains a large number of physical, demographic, nutritional and
lifestyle-related parameters.

# Goal

In the NHANES dataset, one of the columns is named `HealthGen`.
`HealthGen` is a self-reported rating of a participant's health
in general terms. It is reported for participants aged 12
years or older and is a factor with the following levels:
Excellent, Vgood, Good, Fair, or Poor.

We want to assess whether the mean systolic blood pressure
(column `BPSys1`) is equal across the five self-reported
health categories. To this end, we will use an ANOVA
(if the required assumptions are met).

# Load the required libraries
```{r, message = FALSE}
```
# Data import
```{r, message=FALSE}
```
# Data Exploration

To get a more informative and intuitive visualization, you can:

1. Filter out subjects with NA values for `HealthGen` or `BPSys1`
2. Set `HealthGen` to a factor and reorder its levels from Poor to Excellent

**Hint:** The second task can be achieved by using the `mutate`,
`as.factor` and `fct_relevel` functions.
```{r}
```
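
The two preparation steps and the hint above could be sketched as follows. This is one possible approach, not the reference solution. It assumes that the data are available as the `NHANES` data frame from the `NHANES` R package; the object name `nhanes_clean` and the choice of a boxplot are illustrative assumptions.

```r
# Assumption: the data frame `NHANES` comes from the NHANES R package.
library(dplyr)
library(forcats)
library(ggplot2)
library(NHANES)

nhanes_clean <- NHANES %>%
  # Step 1: drop subjects with a missing value for either variable
  filter(!is.na(HealthGen), !is.na(BPSys1)) %>%
  # Step 2: make sure HealthGen is a factor, then order its levels
  # from Poor to Excellent
  mutate(
    HealthGen = as.factor(HealthGen),
    HealthGen = fct_relevel(HealthGen,
                            "Poor", "Fair", "Good", "Vgood", "Excellent")
  )

# One possible visualization: a boxplot of blood pressure per health category
ggplot(nhanes_clean, aes(x = HealthGen, y = BPSys1)) +
  geom_boxplot() +
  labs(x = "Self-reported general health (HealthGen)",
       y = "Systolic blood pressure (BPSys1)")
```

Listing all five levels in `fct_relevel` fixes the complete order, so the categories appear on the x-axis from Poor to Excellent.
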
1. What do you observe from the data exploration?
2. How will you model the data?
3. Translate the research question into a null and an alternative hypothesis.
4. Which test will you use to assess the research hypothesis?
5. Formulate the assumptions of the test and assess them using
diagnostic plots.
6. If all assumptions of the test are met, complete the analysis and
formulate a proper conclusion. If the assumptions are not met (immediately), can
you think of concepts discussed in the theory that could tackle this issue?
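
As a starting point for questions 2, 3 and 5, a one-way ANOVA is commonly written as below. This is a generic sketch rather than the course's own notation: the symbols $y_i$, $g(i)$, $\mu_g$, $\epsilon_i$ and $\sigma^2$ are introduced here as assumptions, with $y_i$ the `BPSys1` value of subject $i$ and $g(i)$ that subject's `HealthGen` category.

$$
y_i = \mu_{g(i)} + \epsilon_i, \qquad \epsilon_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2)
$$

$$
H_0: \mu_{\text{Poor}} = \mu_{\text{Fair}} = \mu_{\text{Good}} = \mu_{\text{Vgood}} = \mu_{\text{Excellent}}
\qquad \text{versus} \qquad
H_1: \mu_g \neq \mu_{g'} \text{ for at least one pair } g \neq g'
$$

Written this way, the model makes the assumptions to check explicit: independent observations, normally distributed errors, and a common variance $\sigma^2$ in all five categories.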