R programming language: The goal of our test is to determine whether the conversion rate of the test group is different

Post by **answerhappygod** » Wed Apr 27, 2022 3:48 pm

R programming language:
The goal of our test is to determine whether the conversion rate
of the test group is different from the
conversion rate of our control group. Conversion rate in this case
is defined as

[Attachment 1: conversion rate definition, transcribed at the end of this post]

Our hypothesis will test whether the difference in conversion
rate or proportion for the test group and control
group is statistically significant when α = 0.01.

[Attachment 2: null and alternative hypotheses, transcribed at the end of this post]

The data we will be using for the following exercises is
test_control.csv. This data represents a simple
random sample of 15,000 individuals served PSA ads and 15,000
individuals served branded digital video
ads. The data also contains an indicator for whether an individual
purchased from our retailer after viewing
the ad.
data:
https://drive.google.com/drive/folders/ ... sp=sharing

1. What variables are available in our data set? List out the
column names and describe the data type of
each variable.
2. How are our test and control samples defined in our data
set?
3. What proportion of individuals from the test group purchased on
the retailer’s website after viewing an
ad? What proportion of individuals from the control group purchased
on the retailer’s website after
viewing an ad?
4. For each of the variables [gender, age, income] create a bar
plot to explore the distribution of
demographic information in our samples.
5. Create a figure with two bar plots (one for the test group
and one for the control group) for age.
Describe the difference in the distribution between the test and
control groups. Compare the percentage
of each category between our test and control groups.
6. Create a figure with two bar plots (one for the test group and
one for the control group) for gender.
Describe the difference in the distribution between the test and
control groups. Add the percentage of
each category to your plots. Why might this variable be important
to our analysis?
7. Create a figure with two bar plots (one for the test group and
one for the control group) for income.
Describe the difference in the distribution between the test and
control groups. Compare the percentage
of each category between our test and control groups.
8. How might the differences in the distributions of the
categorical variables analyzed in #5 - #7 impact
our analysis? Is it possible that our two samples may represent
different types of shoppers?
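
For questions 1 to 7, a rough starting point in base R might look like the sketch below. The real column names in test_control.csv are not shown in this post, so `group`, `purchased`, `gender`, and `age` are assumptions, and the data frame is simulated only so the code runs on its own. The helper names `pct_by_group()` and `plot_by_group()` are made up for illustration.

```r
# Simulated stand-in for test_control.csv; the column names below
# (group, purchased, gender, age) are assumptions, not the real schema.
set.seed(1)
n <- 200
ads <- data.frame(
  group     = rep(c("control", "test"), each = n),
  purchased = rbinom(2 * n, size = 1, prob = 0.1),
  gender    = sample(c("F", "M"), 2 * n, replace = TRUE),
  age       = sample(c("18-24", "25-34", "35-44", "45+"), 2 * n, replace = TRUE),
  stringsAsFactors = TRUE
)

# Column names and data types (question 1).
str(ads)

# Conversion rate per group: purchasers / total individuals in the group.
conv_rate <- tapply(ads$purchased, ads$group, mean)
print(conv_rate)

# Percentage of each category within each group (each column sums to ~100).
pct_by_group <- function(df, var) {
  round(100 * prop.table(table(df[[var]], df$group), margin = 2), 1)
}
age_pct <- pct_by_group(ads, "age")
print(age_pct)

# Two side-by-side bar plots, one per group, sharing the same y-axis range.
plot_by_group <- function(df, var) {
  pct <- pct_by_group(df, var)
  old <- par(mfrow = c(1, 2))
  on.exit(par(old))
  for (g in colnames(pct)) {
    barplot(pct[, g], main = paste(var, "-", g),
            ylab = "% of group", ylim = c(0, max(pct) * 1.1))
  }
  invisible(pct)
}
```

With the real file, `ads` would come from `read.csv("test_control.csv")` instead, and calling `plot_by_group(ads, "age")` (or the actual column name) would draw the two-panel figure asked for in questions 5 to 7.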
Hypothesis Test
9. What is the difference in the conversion rate for the test and
control groups?
The confidence interval for the difference between two proportions
(when n > 30) is defined as

[Attachment 3: confidence interval formula, transcribed at the end of this post]

10. Using the equation above, write a function to calculate the
confidence interval for the difference between
two proportions. Your function should include five arguments: p1,
p2, n1, n2, and Z. Your function
should return the confidence interval for the difference in two
proportions at a given confidence level (in
our example, Z = 2.575 when α = 0.01). Round your results to the
first five decimal places.
11. Calculate the confidence interval for the difference between
the conversion rates for our test and control
groups when α = 0.01 (Z = 2.575) using your function. Does this
confidence interval include zero?
What are the implications for the difference between two means when
the confidence interval does not
include zero?
12. Similar to the t.test() function in R, the prop.test() function
can be used for testing the null
hypothesis that the proportions (probabilities of success) in
several groups are the same, or that they
equal certain given values. A chi-square test for equality of two
proportions is exactly the same test as
a z-test (chi-squared distribution with one degree of freedom is
just that of a normal deviate, squared).
What are the arguments for the function prop.test()?
13. Noting that the arguments x and n require vectors of values,
use the prop.test() function to test our
hypothesis that there is a statistically significant difference
between the conversion rates of our test and
control groups.
14. Interpret each output of prop.test. Explain your p-value in the
context of our hypothesis. Is the
difference in the conversion rates for the test and control groups
statistically significant?
15. Use the function pchisq(x, df=1) to try to understand the
X-squared score value in the output of
prop.test(). What do the “p” functions for distributions calculate
in R? Subtract the value calculated
using pchisq from 1. What does this value represent?
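
Questions 10 to 15 could be approached along the lines of the sketch below. The function name `diff_prop_ci()` is made up for illustration, and the purchase counts are placeholders, not values from test_control.csv. `prop.test()` applies a continuity correction by default; it is switched off here with `correct = FALSE`, which is a choice made for this sketch so that the X-squared statistic matches the squared z statistic mentioned in question 12.

```r
# Confidence interval for the difference between two proportions
# (normal approximation), rounded to five decimal places.
diff_prop_ci <- function(p1, p2, n1, n2, Z) {
  se <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
  round((p1 - p2) + c(lower = -1, upper = 1) * Z * se, 5)
}

# Placeholder counts, NOT taken from test_control.csv.
x <- c(test = 1200, control = 1050)    # individuals who purchased
n <- c(test = 15000, control = 15000)  # individuals in each group
p <- x / n                             # conversion rates

ci <- diff_prop_ci(p[["test"]], p[["control"]],
                   n[["test"]], n[["control"]], Z = 2.575)
print(ci)

# Two-sided test of equal proportions at the 99% confidence level.
# correct = FALSE drops the continuity correction (an assumption of this
# sketch) so that X-squared equals the squared pooled z statistic.
res <- prop.test(x = x, n = n, conf.level = 0.99, correct = FALSE)
print(res)

# pchisq(q, df) returns P(X <= q), the lower tail of the distribution.
# One minus that value is the upper tail, which is the test's p-value.
p_from_chisq <- 1 - pchisq(unname(res$statistic), df = 1)
print(p_from_chisq)
```

If the interval in `ci` excludes zero and `res$p.value` is below 0.01, both point to the same conclusion: the difference in conversion rates is statistically significant at α = 0.01.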
Conclusion
16. In a few sentences, describe your interpretation of the results
we found in this assignment. How might
the demographic data we observed for our test and control groups
impact the difference in the two
conversion rates? Do you still believe that the results of our
hypothesis test are valid? Justify your
answer.
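Conversion rate (attachment 1):
$$\text{Conversion Rate} = \frac{\text{Individuals in Group Who Purchased}}{\text{Total Individuals in Group}}$$

Hypotheses (attachment 2):
$$H_0: p_{test} - p_{control} = 0 \qquad H_1: p_{test} - p_{control} \neq 0$$

Confidence interval for the difference between two proportions (attachment 3):
$$p_{test} - p_{control} \pm Z_{\alpha/2} \sqrt{\frac{p_{test}(1 - p_{test})}{n_{test}} + \frac{p_{control}(1 - p_{control})}{n_{control}}}$$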