17.1.4.3 Algorithms (CrossTabs)
CrossTabs is also known as contingency table analysis. This tool examines whether an association exists between variables and how strong that association is.
CrossTabs Method
 Frequency Counts
 Marginal and Cell
 Chi-Square Tests Table
 Fisher's Exact Test Table (2 x 2 only)
 Measures of Association
 Measures of Agreement
 Odds Ratio and Relative Risk (2 x 2 only)
 Cochran-Mantel-Haenszel
Frequency Counts
Define
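 $X_i$, $i = 1, \dots, R$, are the distinct values of the row variable in ascending order, i.e. $X_1 < X_2 < \dots < X_R$
 $Y_j$, $j = 1, \dots, C$, are the distinct values of the column variable in ascending order, i.e. $Y_1 < Y_2 < \dots < Y_C$
 $f_{ij}$ is the frequency with respect to cell $(X_i, Y_j)$
 $r_i = \sum_{j=1}^{C} f_{ij}$ is the subtotal of the $i$th row
 $c_j = \sum_{i=1}^{R} f_{ij}$ is the subtotal of the $j$th column
 $N = \sum_{i=1}^{R} r_i = \sum_{j=1}^{C} c_j$ is the total number.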
Marginal and Cell
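Statistics: Formula and Explanation
 Count: $f_{ij}$
 Expected Count: $E_{ij} = \frac{r_i c_j}{N}$
 Row Percent: $100 \times f_{ij} / r_i$
 Column Percent: $100 \times f_{ij} / c_j$
 Total Percent: $100 \times f_{ij} / N$
 Residual: $R_{ij} = f_{ij} - E_{ij}$
 Std. Residual: $R_{ij} / \sqrt{E_{ij}}$
 Adj. Residual: $R_{ij} / \sqrt{E_{ij} (1 - r_i/N)(1 - c_j/N)}$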
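The per-cell statistics can be illustrated with a small sketch. It assumes the usual definitions (expected count = row subtotal times column subtotal divided by the total; standardized and adjusted residuals scaled as in standard contingency-table practice). The function name `cell_statistics` and the dictionary keys are hypothetical, chosen only for this illustration.

```python
import math

def cell_statistics(table):
    """Per-cell statistics for a contingency table given as a list of rows."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    result = []
    for i, row in enumerate(table):
        out_row = []
        for j, f in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            resid = f - expected
            out_row.append({
                "count": f,
                "expected": expected,
                "row_pct": 100 * f / row_tot[i],
                "col_pct": 100 * f / col_tot[j],
                "total_pct": 100 * f / n,
                "resid": resid,
                "std_resid": resid / math.sqrt(expected),
                "adj_resid": resid / math.sqrt(
                    expected * (1 - row_tot[i] / n) * (1 - col_tot[j] / n)
                ),
            })
        result.append(out_row)
    return result

stats = cell_statistics([[10, 20], [30, 40]])
print(stats[0][0])
```

The sketch assumes every row and column subtotal is positive; a table with an empty row or column would need to be trimmed first.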


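Chi-Square Statistics
Statistics: Formula and Explanation; Degree of Freedom
 Pearson Chi-Square: $\chi^2_P = \sum_{i=1}^{R}\sum_{j=1}^{C} \frac{(f_{ij} - E_{ij})^2}{E_{ij}}$, with expected count $E_{ij} = r_i c_j / N$; degrees of freedom $(R-1)(C-1)$
 Likelihood Ratio: $\chi^2_{LR} = 2\sum_{i=1}^{R}\sum_{j=1}^{C} f_{ij} \ln\left(\frac{f_{ij}}{E_{ij}}\right)$; degrees of freedom $(R-1)(C-1)$
 Linear Association: $\chi^2_{LA} = (N-1) r^2$, where $r$ is the Pearson correlation coefficient; degrees of freedom 1
 Continuity Correction: $\chi^2_C = \frac{N \left(|f_{11} f_{22} - f_{12} f_{21}| - N/2\right)^2}{r_1 r_2 c_1 c_2}$, which is calculated only for a 2 x 2 table; degrees of freedom 1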
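As a rough illustration of the chi-square statistics, the sketch below computes the Pearson and likelihood-ratio statistics for any table and the continuity-corrected statistic for a 2 x 2 table. It follows the textbook formulas; setting the corrected statistic to zero when $|f_{11}f_{22} - f_{12}f_{21}| \le N/2$ is an assumption of this sketch, and the function name `chi_square_stats` is hypothetical.

```python
import math

def chi_square_stats(table):
    """Pearson, likelihood-ratio and (2 x 2 only) continuity-corrected chi-square."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    pearson = 0.0
    likelihood = 0.0
    for i, row in enumerate(table):
        for j, f in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            pearson += (f - expected) ** 2 / expected
            if f > 0:  # empty cells contribute nothing to the likelihood ratio
                likelihood += 2 * f * math.log(f / expected)
    result = {
        "pearson": pearson,
        "likelihood_ratio": likelihood,
        "df": (len(table) - 1) * (len(table[0]) - 1),
    }
    if len(table) == 2 and len(table[0]) == 2:
        det = abs(table[0][0] * table[1][1] - table[0][1] * table[1][0])
        # Assumption: the corrected statistic is floored at zero.
        result["continuity"] = (
            n * max(det - n / 2, 0) ** 2
            / (row_tot[0] * row_tot[1] * col_tot[0] * col_tot[1])
        )
    return result

print(chi_square_stats([[10, 20], [30, 40]]))
```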


Fisher's Exact Test
This test is useful when some expected cell counts are low (less than 5). It is calculated only for a 2 x 2 table. Suppose we have the following table:
                   $Y_1$      $Y_2$      Subtotal/Total
 $X_1$             $f_{11}$   $f_{12}$   $r_1$
 $X_2$             $f_{21}$   $f_{22}$   $r_2$
 Subtotal/Total    $c_1$      $c_2$      $N$
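Under the null hypothesis (independence), the count of the first cell follows a hypergeometric distribution with probability given by
$P(x) = \frac{\binom{r_1}{x} \binom{r_2}{c_1 - x}}{\binom{N}{c_1}}$, $\max(0, c_1 - r_2) \le x \le \min(r_1, c_1)$.
One-Sided test
The one-sided test significance level is calculated by
 p(left-sided test) = $\sum_{x \le f_{11}} P(x)$
 p(right-sided test) = $\sum_{x \ge f_{11}} P(x)$
Two-Sided tail
The two-tail significance is
$p_2 = \sum_{x} I(x) P(x)$,
where
 $I(x) = 1$, if $P(x) \le P(f_{11})$
 $I(x) = 0$, if $P(x) > P(f_{11})$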
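A minimal sketch of Fisher's exact test for a 2 x 2 table follows. It enumerates the hypergeometric probabilities of the first cell with the margins held fixed and, for the two-sided value, sums the probabilities of all outcomes no more likely than the observed one; that two-sided convention is the common one and is assumed here. The function name `fisher_exact_2x2` and the small relative tolerance are choices made for this illustration.

```python
from math import comb

def fisher_exact_2x2(f11, f12, f21, f22):
    """Return (left-sided, right-sided, two-sided) p-values for a 2 x 2 table."""
    r1, r2 = f11 + f12, f21 + f22
    c1 = f11 + f21
    n = r1 + r2
    lo, hi = max(0, c1 - r2), min(r1, c1)
    denom = comb(n, c1)
    # Hypergeometric probability of each admissible first-cell count.
    prob = {x: comb(r1, x) * comb(r2, c1 - x) / denom for x in range(lo, hi + 1)}
    p_obs = prob[f11]
    left = sum(p for x, p in prob.items() if x <= f11)
    right = sum(p for x, p in prob.items() if x >= f11)
    # The tolerance guards against floating-point ties between equal probabilities.
    two_sided = sum(p for p in prob.values() if p <= p_obs * (1 + 1e-7))
    return left, right, two_sided

left, right, two_sided = fisher_exact_2x2(3, 1, 1, 3)
print(left, right, two_sided)
```

For the table (3, 1; 1, 3) the admissible first-cell counts 0 to 4 have probabilities 1/70, 16/70, 36/70, 16/70 and 1/70, which makes the three sums easy to check by hand.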
Measures of Association
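Define
 $C_{ij} = \sum_{h<i}\sum_{k<j} f_{hk} + \sum_{h>i}\sum_{k>j} f_{hk}$
 $D_{ij} = \sum_{h<i}\sum_{k>j} f_{hk} + \sum_{h>i}\sum_{k<j} f_{hk}$
 $P = \sum_{i,j} f_{ij} C_{ij}$ (twice the number of concordant pairs)
 $Q = \sum_{i,j} f_{ij} D_{ij}$ (twice the number of discordant pairs)
 $D_r = N^2 - \sum_{i} r_i^2$
 $D_c = N^2 - \sum_{j} c_j^2$
 $r_i$ is the subtotal of the $i$th row
 $c_j$ is the subtotal of the $j$th column
 $N$ is the total number.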
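Statistics: Formula and Explanation; Standard Error
 Phi Coefficient: $\phi = \sqrt{\chi^2 / N}$, where $\chi^2$ is the Pearson chi-square statistic, which is calculated for a table that is not 2 x 2. For a 2 x 2 table, it is equal to $\phi = \frac{f_{11} f_{22} - f_{12} f_{21}}{\sqrt{r_1 r_2 c_1 c_2}}$.
 The value ranges from $0 \le \phi \le \sqrt{q - 1}$ for a table that is not 2 x 2, where $q = \min(R, C)$, and from $-1$ to $1$ for a 2 x 2 table.
 Cramer's V: $V = \sqrt{\frac{\chi^2}{N (q - 1)}}$, where $q = \min(R, C)$
 Contingency Coefficient: $CC = \sqrt{\frac{\chi^2}{\chi^2 + N}}$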
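 Gamma: $\gamma = \frac{P - Q}{P + Q}$; standard error $\frac{4}{(P + Q)^2} \sqrt{\sum_{i,j} f_{ij} (Q C_{ij} - P D_{ij})^2}$
 Kendall
  Tau-b: $\tau_b = \frac{P - Q}{\sqrt{D_r D_c}}$; standard error $\sigma_{\tau_b} = \frac{1}{D_r D_c} \sqrt{\sum_{i,j} f_{ij} \left(2 \sqrt{D_r D_c} (C_{ij} - D_{ij}) + \tau_b v_{ij}\right)^2 - N^3 \tau_b^2 (D_r + D_c)^2}$, where $v_{ij} = r_i D_c + c_j D_r$
  Tau-c: $\tau_c = \frac{q (P - Q)}{N^2 (q - 1)}$, where $q = \min(R, C)$; standard error $\frac{2q}{(q - 1) N^2} \sqrt{\sum_{i,j} f_{ij} (C_{ij} - D_{ij})^2 - \frac{(P - Q)^2}{N}}$
 Somers' D
  C|R: $d_{C|R} = \frac{P - Q}{D_r}$; standard error $\frac{2}{D_r^2} \sqrt{\sum_{i,j} f_{ij} \left(D_r (C_{ij} - D_{ij}) - (P - Q)(N - r_i)\right)^2}$
  R|C: $d_{R|C} = \frac{P - Q}{D_c}$; standard error $\frac{2}{D_c^2} \sqrt{\sum_{i,j} f_{ij} \left(D_c (C_{ij} - D_{ij}) - (P - Q)(N - c_j)\right)^2}$
  Symmetric: $d = \frac{2 (P - Q)}{D_r + D_c}$; standard error $\frac{2 \sigma_{\tau_b} \sqrt{D_r D_c}}{D_r + D_c}$, where $\sigma_{\tau_b}$ is the standard error of Tau-b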
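 Lambda
  C|R: $\lambda_{C|R} = \frac{\sum_{i} f_{im} - c_m}{N - c_m}$, where $f_{im}$ is the largest count in the $i$th row, and $c_m$ is the largest column subtotal.
  Standard error: $\sqrt{\frac{N - \sum_{i} f_{im}}{(N - c_m)^3} \left(\sum_{i} f_{im} + c_m - 2 \sum_{i} (f_{im} \mid l_i = l)\right)}$,
  where $l_i$ is the column index of $f_{im}$, and $l$ is the index of the column subtotal for $c_m$.
  R|C: $\lambda_{R|C} = \frac{\sum_{j} f_{mj} - r_m}{N - r_m}$,
  where $f_{mj}$ is the largest count in the $j$th column, and $r_m$ is the largest row subtotal.
  Standard error: $\sqrt{\frac{N - \sum_{j} f_{mj}}{(N - r_m)^3} \left(\sum_{j} f_{mj} + r_m - 2 \sum_{j} (f_{mj} \mid k_j = k)\right)}$,
  where $k_j$ is the row index of $f_{mj}$, and $k$ is the index of the row subtotal for $r_m$.
  Symmetric: $\lambda = \frac{\sum_{i} f_{im} + \sum_{j} f_{mj} - c_m - r_m}{2N - c_m - r_m}$
  Standard error: $\frac{1}{w^2} \sqrt{w v y - 2 w^2 \left[N - \sum_{i}\sum_{j} (f_{ij} \mid j = l_i, i = k_j)\right] - 2 v^2 (N - f_{kl})}$,
  where $w = 2N - c_m - r_m$, $v = 2N - \sum_{i} f_{im} - \sum_{j} f_{mj}$, $x = \sum_{i} (f_{im} \mid l_i = l) + \sum_{j} (f_{mj} \mid k_j = k) + f_{km} + f_{ml}$, and $y = 8N - w - v - 2x$. Here $f_{km}$ is the largest count in row $k$ and $f_{ml}$ is the largest count in column $l$.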
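 Uncertainty
  C|R: $U_{C|R} = \frac{H(X) + H(Y) - H(XY)}{H(Y)}$, where $H(X) = -\sum_{i} \frac{r_i}{N} \ln\frac{r_i}{N}$, and $H(Y) = -\sum_{j} \frac{c_j}{N} \ln\frac{c_j}{N}$, and $H(XY) = -\sum_{i,j} \frac{f_{ij}}{N} \ln\frac{f_{ij}}{N}$
  Standard error: $\frac{1}{N H(Y)^2} \sqrt{\sum_{i,j} f_{ij} \left(H(Y) \ln\frac{f_{ij}}{r_i} + (H(X) - H(XY)) \ln\frac{c_j}{N}\right)^2}$, where the sums run over cells with $f_{ij} > 0$
  R|C: $U_{R|C} = \frac{H(X) + H(Y) - H(XY)}{H(X)}$
  Standard error: $\frac{1}{N H(X)^2} \sqrt{\sum_{i,j} f_{ij} \left(H(X) \ln\frac{f_{ij}}{c_j} + (H(Y) - H(XY)) \ln\frac{r_i}{N}\right)^2}$
  Symmetric: $U = \frac{2 \left(H(X) + H(Y) - H(XY)\right)}{H(X) + H(Y)}$
  Standard error: $\frac{2}{N (H(X) + H(Y))^2} \sqrt{\sum_{i,j} f_{ij} \left(H(XY) \ln\frac{r_i c_j}{N^2} - (H(X) + H(Y)) \ln\frac{f_{ij}}{N}\right)^2}$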
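The ordinal measures above (Gamma, Kendall's Tau-b and Tau-c, Somers' D) all derive from the concordant and discordant pair counts. The sketch below assumes the standard definitions, with P and Q counting each pair twice; the function name `ordinal_association` and the dictionary keys are hypothetical. Standard errors are not computed here.

```python
import math

def ordinal_association(table):
    """Gamma, Tau-b, Tau-c and Somers' D from concordant/discordant counts."""
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    p = q = 0
    for i in range(n_rows):
        for j in range(n_cols):
            conc = disc = 0
            for h in range(n_rows):
                for k in range(n_cols):
                    if (h < i and k < j) or (h > i and k > j):
                        conc += table[h][k]
                    elif (h < i and k > j) or (h > i and k < j):
                        disc += table[h][k]
            p += table[i][j] * conc  # twice the concordant pairs
            q += table[i][j] * disc  # twice the discordant pairs
    d_r = n * n - sum(r * r for r in row_tot)
    d_c = n * n - sum(c * c for c in col_tot)
    m = min(n_rows, n_cols)
    return {
        "gamma": (p - q) / (p + q),
        "tau_b": (p - q) / math.sqrt(d_r * d_c),
        "tau_c": m * (p - q) / (n * n * (m - 1)),
        "somers_d_c_given_r": (p - q) / d_r,
        "somers_d_r_given_c": (p - q) / d_c,
        "somers_d_symmetric": 2 * (p - q) / (d_r + d_c),
    }

print(ordinal_association([[10, 20], [30, 40]]))
```

The quadruple loop is deliberately naive for readability; for large tables, cumulative sums would give the same counts far more cheaply.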



Measures of Agreement
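This table is calculated only when two conditions are satisfied: (1) the table is square, i.e. $R = C$, and (2) the row variable and the column variable have the same values.
The Kappa statistic is calculated by $\kappa = \frac{P_o - P_e}{1 - P_e}$, where $P_o = \sum_{i} p_{ii}$, $P_e = \sum_{i} p_{i\cdot} p_{\cdot i}$, $p_{ij} = f_{ij}/N$, $p_{i\cdot} = r_i/N$ and $p_{\cdot j} = c_j/N$.
The standard error is estimated by:
 $\sqrt{\frac{A + B - C}{N (1 - P_e)^2}}$.
where $A = \sum_{i} p_{ii} \left[1 - (p_{i\cdot} + p_{\cdot i})(1 - \kappa)\right]^2$, $B = (1 - \kappa)^2 \sum_{i \ne j} p_{ij} (p_{\cdot i} + p_{j\cdot})^2$,
and $C = \left[\kappa - P_e (1 - \kappa)\right]^2$.
The corresponding asymptotic standard error under the null hypothesis is given by $\sqrt{\frac{P_e + P_e^2 - \sum_{i} p_{i\cdot} p_{\cdot i} (p_{i\cdot} + p_{\cdot i})}{N (1 - P_e)^2}}$
Another related statistic is Bowker's test of symmetry, which is used to test $H_0: p_{ij} = p_{ji}$ for all pairs. If $R > 2$, the statistic is calculated as $Q_B = \sum_{i<j} \frac{(f_{ij} - f_{ji})^2}{f_{ij} + f_{ji}}$
For larger samples, $Q_B$ asymptotically follows a chi-square distribution with $R(R-1)/2$ degrees of freedom.
Note that for a 2 x 2 table, Bowker's test is equal to McNemar's test, so only Bowker's test is given.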
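As a small illustration of the agreement measures, the sketch below computes Cohen's kappa and Bowker's symmetry statistic for a square table, assuming the standard definitions (observed versus chance agreement for kappa; squared off-diagonal differences for Bowker). Skipping off-diagonal pairs whose counts are both zero is an assumption of this sketch, and the function name `kappa_and_bowker` is hypothetical.

```python
def kappa_and_bowker(table):
    """Return (kappa, Bowker statistic, Bowker degrees of freedom)."""
    k = len(table)
    if any(len(row) != k for row in table):
        raise ValueError("agreement measures need a square table")
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    p_obs = sum(table[i][i] for i in range(k)) / n
    p_exp = sum(row_tot[i] * col_tot[i] for i in range(k)) / n ** 2
    kappa = (p_obs - p_exp) / (1 - p_exp)
    bowker = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            pair = table[i][j] + table[j][i]
            if pair > 0:  # assumption: empty pairs contribute nothing
                bowker += (table[i][j] - table[j][i]) ** 2 / pair
    return kappa, bowker, k * (k - 1) // 2

kappa, bowker, df = kappa_and_bowker([[20, 5], [10, 15]])
print(kappa, bowker, df)
```

With a 2 x 2 input the Bowker value reduces to the uncorrected McNemar statistic, which gives a quick sanity check.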
Odds Ratio and Relative Risk
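These statistics are calculated only for a 2 x 2 table.
Odds Ratio
The Odds Ratio is calculated as $OR = \frac{f_{11} f_{22}}{f_{12} f_{21}}$
Relative Risk
The Relative Risks are given by
 $RR_1 = \frac{f_{11} / r_1}{f_{21} / r_2}$ (relative risk for column 1)
 $RR_2 = \frac{f_{12} / r_1}{f_{22} / r_2}$ (relative risk for column 2)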
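A minimal sketch of these 2 x 2 statistics, assuming the standard definitions: the odds ratio is the cross-product ratio, and each relative risk compares the row-wise proportions of one column. The function name and dictionary keys are hypothetical, and no guard against zero cells is included.

```python
def odds_ratio_and_relative_risk(f11, f12, f21, f22):
    """Odds ratio and column-wise relative risks for a 2 x 2 table."""
    r1, r2 = f11 + f12, f21 + f22
    return {
        "odds_ratio": (f11 * f22) / (f12 * f21),
        "relative_risk_col1": (f11 / r1) / (f21 / r2),
        "relative_risk_col2": (f12 / r1) / (f22 / r2),
    }

print(odds_ratio_and_relative_risk(10, 20, 30, 40))
```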




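Cochran-Mantel-Haenszel
Define
 $K$ be the number of layers
 $f_{ijk}$ be the frequency in the ith row, jth column and kth layer
 $c_{jk} = \sum_{i} f_{ijk}$ be the jth column, kth layer subtotal
 $r_{ik} = \sum_{j} f_{ijk}$ be the ith row, kth layer subtotal
 $n_k = \sum_{i}\sum_{j} f_{ijk}$ be the kth layer subtotal
 $E_{ijk} = \frac{r_{ik} c_{jk}}{n_k}$ be the expected frequency of the ith row, jth column, kth layer cell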

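Mantel-Haenszel statistic
The Mantel-Haenszel statistic is given by
$\chi^2_{MH} = \frac{\left(\left|\sum_{k=1}^{K} (f_{11k} - E_{11k})\right| - 0.5\right)^2}{\sum_{k=1}^{K} \frac{r_{1k} r_{2k} c_{1k} c_{2k}}{n_k^2 (n_k - 1)}} \, \mathrm{sgn}\left(\left|\sum_{k=1}^{K} (f_{11k} - E_{11k})\right| - 0.5\right)$
where $E_{11k} = r_{1k} c_{1k} / n_k$ and sgn is the sign function: $\mathrm{sgn}(x) = 1$ if $x > 0$, $0$ if $x = 0$, and $-1$ if $x < 0$.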
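Breslow-Day statistic
The Breslow-Day statistic is
$\chi^2_{BD} = \sum_{k=1}^{K} \frac{(f_{11k} - \hat{E}_{11k})^2}{\hat{V}_k}$
where $\hat{V}_k = \left(\frac{1}{\hat{E}_{11k}} + \frac{1}{\hat{E}_{12k}} + \frac{1}{\hat{E}_{21k}} + \frac{1}{\hat{E}_{22k}}\right)^{-1}$, and $\hat{E}_{ijk}$ is the expected frequency of the cell in the kth layer given the estimated common odds ratio.
Tarone’s Statistic
The Tarone’s Statistic is
$\chi^2_{T} = \sum_{k=1}^{K} \frac{(f_{11k} - \hat{E}_{11k})^2}{\hat{V}_k} - \frac{\left(\sum_{k=1}^{K} (f_{11k} - \hat{E}_{11k})\right)^2}{\sum_{k=1}^{K} \hat{V}_k}$
where $\hat{V}_k = \left(\frac{1}{\hat{E}_{11k}} + \frac{1}{\hat{E}_{12k}} + \frac{1}{\hat{E}_{21k}} + \frac{1}{\hat{E}_{22k}}\right)^{-1}$.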
Common Odds Ratio
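For a 2×2×K table, the odds ratio at the kth layer is $OR_k = \frac{f_{11k} f_{22k}}{f_{12k} f_{21k}}$.
Assuming that the true common odds ratio exists, that is, $OR_1 = OR_2 = \dots = OR_K$, Mantel-Haenszel's estimator of the common odds ratio is $\hat{OR}_{MH} = \frac{\sum_{k} f_{11k} f_{22k} / n_k}{\sum_{k} f_{12k} f_{21k} / n_k}$
The asymptotic variance for $\ln(\hat{OR}_{MH})$ is:
$\hat{\sigma}^2 = \frac{\sum_{k} (f_{11k} + f_{22k}) f_{11k} f_{22k} / n_k^2}{2 \left(\sum_{k} f_{11k} f_{22k} / n_k\right)^2} + \frac{\sum_{k} \left[(f_{11k} + f_{22k}) f_{12k} f_{21k} + (f_{12k} + f_{21k}) f_{11k} f_{22k}\right] / n_k^2}{2 \left(\sum_{k} f_{11k} f_{22k} / n_k\right) \left(\sum_{k} f_{12k} f_{21k} / n_k\right)} + \frac{\sum_{k} (f_{12k} + f_{21k}) f_{12k} f_{21k} / n_k^2}{2 \left(\sum_{k} f_{12k} f_{21k} / n_k\right)^2}$
The lower confidence limit (LCL) and upper confidence limit (UCL) for $\hat{OR}_{MH}$ are:
 $LCL = \hat{OR}_{MH} \exp(-z_{\alpha/2} \hat{\sigma})$ and $UCL = \hat{OR}_{MH} \exp(z_{\alpha/2} \hat{\sigma})$, where $z_{\alpha/2}$ is the upper $\alpha/2$ critical value of the standard normal distribution.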
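The common odds ratio estimate can be sketched as follows. The code assumes the Mantel-Haenszel estimator with the Robins-Breslow-Greenland variance for the log odds ratio and a normal-approximation confidence interval obtained by exponentiating; whether a given implementation uses exactly this variance is an assumption here. The function name `mantel_haenszel_or` and the layer layout `(f11, f12, f21, f22)` are hypothetical.

```python
import math
from statistics import NormalDist

def mantel_haenszel_or(layers, alpha=0.05):
    """layers: iterable of (f11, f12, f21, f22), one tuple per 2 x 2 layer.

    Returns (common odds ratio, variance of its log, LCL, UCL).
    """
    sum_r = sum_s = sum_pr = sum_mixed = sum_qs = 0.0
    for f11, f12, f21, f22 in layers:
        n = f11 + f12 + f21 + f22
        r = f11 * f22 / n          # numerator term of the estimator
        s = f12 * f21 / n          # denominator term of the estimator
        p = (f11 + f22) / n
        q = (f12 + f21) / n
        sum_r += r
        sum_s += s
        sum_pr += p * r
        sum_mixed += p * s + q * r
        sum_qs += q * s
    or_mh = sum_r / sum_s
    var_log = (
        sum_pr / (2 * sum_r ** 2)
        + sum_mixed / (2 * sum_r * sum_s)
        + sum_qs / (2 * sum_s ** 2)
    )
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * math.sqrt(var_log)
    return or_mh, var_log, or_mh * math.exp(-half_width), or_mh * math.exp(half_width)

print(mantel_haenszel_or([(10, 20, 30, 40), (15, 25, 20, 30)]))
```

With a single layer the variance collapses to the familiar sum of reciprocal cell counts, which is a convenient check on the arithmetic.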
