R Resources for Chapter 15 (Inference for Counts)
Goodness-of-Fit Tests
The chisq.test function can be used with a vector of counts and a vector of expected proportions to perform a goodness-of-fit test.
> counts <- c(102, 68, 30) # make a vector with the data
> # Chi-sq test
> results1 <- chisq.test(counts, p=c(.6, .3, .1))
> results1 # results
Chi-squared test for given probabilities
data: counts
X-squared = 8.7667, df = 2, p-value = 0.01248
The expected counts under the null hypothesis can be obtained from the expected component of the result:
> results1$expected # expected counts
[1] 120 60 20
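To see where the statistic comes from, the calculation can be reproduced by hand from the observed and expected counts. The following is a minimal sketch, not part of the original example; the names observed, p0, expected, x2, deg_free, and p_value are chosen here for illustration.

```r
# Observed counts and hypothesized proportions from the example above
observed <- c(102, 68, 30)
p0 <- c(0.6, 0.3, 0.1)

# Expected counts under the null hypothesis: sample size times each proportion
expected <- sum(observed) * p0

# Chi-squared statistic: sum of (observed - expected)^2 / expected
x2 <- sum((observed - expected)^2 / expected)

# Degrees of freedom: number of categories minus 1
deg_free <- length(observed) - 1

# p-value: area to the right of the statistic
p_value <- pchisq(x2, deg_free, lower.tail = FALSE)

x2       # compare with X-squared reported by chisq.test
p_value  # compare with the p-value reported by chisq.test
```

Both values should agree with the chisq.test output shown above, up to rounding.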
The critical value from the chi-square distribution is obtained using the qchisq function with the area to the left (1-alpha) and the degrees of freedom.
> qchisq(.95,2) # Critical value for alpha=0.05, df=2
[1] 5.991465
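The test decision can then be made by comparing the statistic with the critical value. This is a short sketch assuming alpha = 0.05 and the same counts as above; the names alpha, critical, and reject are illustrative.

```r
counts <- c(102, 68, 30)
results1 <- chisq.test(counts, p = c(0.6, 0.3, 0.1))

alpha <- 0.05

# Critical value: area 1 - alpha to the left, using the test's degrees of freedom
critical <- qchisq(1 - alpha, df = results1$parameter)

# Reject the null hypothesis when the statistic exceeds the critical value
reject <- unname(results1$statistic > critical)
reject

# Equivalent decision based on the p-value
results1$p.value < alpha
```

The two comparisons always lead to the same decision, because the p-value is below alpha exactly when the statistic is above the critical value.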
Test of Homogeneity or Independence
There are two ways to enter a two-way table of count data. If you download data from MyStatLab,
you will need to remove column and row totals (if given) and save the data as a comma-delimited .csv file. Once this
is done, you can use the read.csv and as.matrix functions to obtain the matrix. The row.names=1 option specifies
that the first column is used as row names and not as part of the matrix.
> # Download data from MyStatLab, remove totals if given, and save file as .csv
> example <- as.matrix(read.csv("~/Downloads/data12-3.csv", row.names=1))
RStudio is likely to give you a warning message, which you can ignore. It only indicates that the last line of the file does not end with a newline character.
Warning message:
In read.table(file = file, header = header, sep = sep, quote = quote, :
incomplete final line found by readTableHeader on '~/Downloads/data12-3.csv'
You can check to see that the matrix looks the way you expect:
> example
               Less_than_30 X30.55 X56_or_older
In-town_branch           21     39           40
Mall_branch              29     51           20
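To try the import step without the MyStatLab download, the table can be written to a temporary file and read back. This is only a sketch: the temporary file, the header label Branch, and the names csv_lines, tmp, and example2 are assumptions made for illustration. It also uses check.names=FALSE, an option of read.csv that keeps column headers as written instead of converting them to syntactically valid names (the conversion is why a header such as 30-55 shows up as X30.55).

```r
# Contents of a small .csv file: a header row, then one row per branch
csv_lines <- c(
  "Branch,Less_than_30,30-55,56_or_older",
  "In-town_branch,21,39,40",
  "Mall_branch,29,51,20"
)

# Write to a temporary file (stands in for the downloaded file)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# row.names = 1 uses the first column as row names;
# check.names = FALSE keeps the headers exactly as written
example2 <- as.matrix(read.csv(tmp, row.names = 1, check.names = FALSE))
example2
```

Because writeLines ends the last line with a newline, reading this file should not trigger the incomplete-final-line warning.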
To perform the Chi-squared test:
> results3 <- chisq.test(example)
> results3
Pearson's Chi-squared test
data: example
X-squared = 9.5467, df = 2, p-value = 0.008452
As before, you can obtain the expected counts.
> results3$expected
               Less_than_30 X30.55 X56_or_older
In-town_branch           25     45           30
Mall_branch              25     45           30
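The expected counts for a two-way table come from the row and column totals: each cell is (row total x column total) / grand total. The sketch below reproduces them by hand, along with the statistic and p-value; the names observed, expected, deg_free, x2, and p_value and the short row and column labels are illustrative choices.

```r
# Observed two-way table (entered column by column)
observed <- matrix(c(21, 29, 39, 51, 40, 20), nrow = 2,
                   dimnames = list(c("In-town", "Mall"),
                                   c("<30", "30-55", ">55")))

# Expected count for each cell: (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)
expected

# Degrees of freedom: (rows - 1) * (columns - 1)
deg_free <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Chi-squared statistic and p-value (area to the right of the statistic)
x2 <- sum((observed - expected)^2 / expected)
p_value <- pchisq(x2, deg_free, lower.tail = FALSE)

x2
p_value
```

These values should agree with the chisq.test output for the same table, up to rounding.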
Alternatively, you can enter a matrix with the matrix function. List the data column by column and
specify the number of rows and columns using the nrow and ncol options. Note that the matrix
obtained has exactly the same data as above, but default values are used for row and column names.
> Alternative <- matrix(c(21, 29, 39, 51, 40, 20), nrow=2, ncol=3)
> Alternative
     [,1] [,2] [,3]
[1,]   21   39   40
[2,]   29   51   20
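If it is more natural to type the table row by row, the byrow option of the matrix function can be used instead. A small sketch comparing the two ways of entering the data (the names by_row and by_col are illustrative):

```r
# Same table entered row by row; byrow = TRUE fills the matrix across rows
by_row <- matrix(c(21, 39, 40,
                   29, 51, 20), nrow = 2, ncol = 3, byrow = TRUE)

# Column-by-column entry, which is the default fill order
by_col <- matrix(c(21, 29, 39, 51, 40, 20), nrow = 2, ncol = 3)

# Check that both describe the same table
identical(by_row, by_col)
```

Either form can be passed to chisq.test; the choice only affects how the numbers are typed.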
Exactly the same results are obtained using the matrix without row and column names:
> results3 <- chisq.test(Alternative)
> results3
Pearson's Chi-squared test
data: Alternative
X-squared = 9.5467, df = 2, p-value = 0.008452
> results3$expected
     [,1] [,2] [,3]
[1,]   25   45   30
[2,]   25   45   30
If you want row and column names, you can add them with the rownames and colnames functions. For example:
> rownames(Alternative) <- c("In-town", "Mall")
> colnames(Alternative) <- c("<30", "30-55", ">55")
> Alternative
        <30 30-55 >55
In-town  21    39  40
Mall     29    51  20
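The names can also be supplied when the matrix is created, through the dimnames option of the matrix function. A brief sketch (the name Labeled is illustrative):

```r
# dimnames takes a list: row names first, then column names
Labeled <- matrix(c(21, 29, 39, 51, 40, 20), nrow = 2, ncol = 3,
                  dimnames = list(c("In-town", "Mall"),
                                  c("<30", "30-55", ">55")))
Labeled

# The labels carry through to the expected counts
chisq.test(Labeled)$expected
```

Labeling the table up front makes the expected counts easier to read, since the output shows the category names rather than default indices.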