Problem 2: Use RStudio software to attempt the question The data were collected from a group of workers in the cotton in
Posted: Wed May 11, 2022 5:36 am
Problem 2: Use RStudio software to attempt the question.
The data were collected from a group of workers in the cotton industry to assess the prevalence of the lung disease byssinosis among these workers. This disease is caused by long-term exposure to particles of cotton, hemp, flax and jute while working in this type of environment. It can result in asthma-like symptoms, which can lead to death among sufferers. The response variable y is binary and refers to the number of workers suffering (response = yes) and not suffering (response = no), and the predictors are:
x1 = dustiness of the workplace (1 = high, 2 = medium, 3 = low)
x2 = race (1 = European, 2 = other)
x3 = sex (1 = male, 2 = female)
x4 = smoking history (1 = smoker, 2 = nonsmoker)
x5 = length of employment in the cotton industry (1 = less than 10 years, 2 = between 10 and 20 years, 3 = more than 20 years)
Notice that all five predictors are qualitative variables and the responses are entered in the event/trial format.
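To make the event/trial layout concrete, here is a minimal R sketch of how such a table might look once loaded. The data frame name `toy`, the column names `yes`, `no` and `trials`, and every count in it are assumptions: the counts are simulated placeholders, not the byssinosis data. The sketch only shows one row per combination of predictor levels and the predictors stored as factors, since they are qualitative.

```r
# Hypothetical toy data: simulated placeholder counts, NOT the real byssinosis data.
set.seed(1)

# One row per combination of the five predictor levels (3 * 2 * 2 * 2 * 3 = 72 rows).
toy <- expand.grid(x1 = 1:3, x2 = 1:2, x3 = 1:2, x4 = 1:2, x5 = 1:3)

# Event/trial format: each row carries counts rather than a single 0/1 outcome.
toy$trials <- rpois(nrow(toy), lambda = 50) + 1                 # made-up workers per cell
toy$yes    <- rbinom(nrow(toy), size = toy$trials, prob = 0.1)  # made-up sufferers
toy$no     <- toy$trials - toy$yes                              # made-up non-sufferers

# The level codes are labels, not quantities, so store each predictor as a factor.
predictors <- c("x1", "x2", "x3", "x4", "x5")
toy[predictors] <- lapply(toy[predictors], factor)

str(toy)
```

With factors, `glm()` creates dummy variables for each level and treats the first level as the baseline, which is usually what is wanted for coded categories like these.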
a) Fit a logistic regression model to the data set and discuss which of the predictors have a significant effect on the presence of byssinosis. State the final model.
# Note: you need to convert into binary numbers
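One way part a) could be approached in R is sketched below. It is not the official solution: the data frame `toy` and its counts are simulated placeholders (not the real data), the column names `yes` and `no` are assumptions, and reading "convert into binary" as expanding the counts to one 0/1 row per worker is only one possible interpretation of the note. The sketch fits the model directly on the event/trial counts with `cbind(yes, no)`, tests each factor with `drop1()`, and then refits on the expanded 0/1 data.

```r
# Hypothetical toy data: simulated placeholder counts, NOT the real byssinosis data.
set.seed(2)
toy <- expand.grid(x1 = factor(1:3), x2 = factor(1:2), x3 = factor(1:2),
                   x4 = factor(1:2), x5 = factor(1:3))
toy$yes <- rbinom(nrow(toy), size = 40, prob = 0.15)  # made-up sufferers per cell
toy$no  <- 40 - toy$yes                               # made-up non-sufferers per cell

# Grouped (event/trial) fit: the response is a two-column matrix of counts.
fit_grouped <- glm(cbind(yes, no) ~ x1 + x2 + x3 + x4 + x5,
                   family = binomial, data = toy)
print(summary(fit_grouped))                # Wald z-tests for each dummy coefficient
print(drop1(fit_grouped, test = "Chisq"))  # likelihood-ratio test for each whole factor

# One possible "convert into binary" step: one row per worker, y = 1 (yes) or 0 (no).
preds <- c("x1", "x2", "x3", "x4", "x5")
long <- rbind(
  cbind(toy[rep(seq_len(nrow(toy)), toy$yes), preds], y = 1),
  cbind(toy[rep(seq_len(nrow(toy)), toy$no),  preds], y = 0)
)
fit_binary <- glm(y ~ x1 + x2 + x3 + x4 + x5, family = binomial, data = long)
```

The two fits should give the same coefficient estimates, because the grouped and ungrouped likelihoods differ only by a constant; the residual deviances differ, so goodness-of-fit should be judged on the grouped fit. Predictors whose `drop1()` p-value is large are candidates for removal when stating the final model.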