17.1.3.2 Interpreting Results of Cross Tabulation
Contingency Table
The contingency table gives information about the frequency distribution of the variables, including counts, percentages and residuals.
Counts, Row%, Col% and Total% help users compare the levels across the groups.
Residuals are statistics for testing the independence of the row and column variables. The closer the value is to zero, the more likely it is that the row and column variables have no association.
The adjusted residual is the most useful residual for comparing cells, because it is standardized to N(0,1). If the value is larger than 1.96 or less than -1.96, the observed count is significantly larger or smaller than expected. The larger the absolute value, the more likely the column variable is associated with the row variable.
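As an illustration of how the adjusted residuals are computed outside of Origin, here is a short Python sketch (all counts are hypothetical):

```python
import math

# Hypothetical 2x3 table of observed counts (rows = groups, cols = outcomes).
obs = [[20, 30, 10],
       [10, 15, 35]]

n = sum(sum(row) for row in obs)
row_tot = [sum(row) for row in obs]
col_tot = [sum(col) for col in zip(*obs)]

adj_resid = []
for i, row in enumerate(obs):
    adj_row = []
    for j, o in enumerate(row):
        e = row_tot[i] * col_tot[j] / n  # expected count under independence
        # Standard error of the residual; the adjusted residual is ~ N(0,1)
        # under the independence hypothesis.
        se = math.sqrt(e * (1 - row_tot[i] / n) * (1 - col_tot[j] / n))
        adj_row.append((o - e) / se)
    adj_resid.append(adj_row)

for row in adj_resid:
    print(["%.2f" % r for r in row])
```

A cell with an adjusted residual outside [-1.96, 1.96] deviates significantly from its expected count at the 0.05 level.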
Chi-Square Tests
The chi-square tests provide results for testing the hypothesis that the row and column variables are independent.
The Chi-Square Tests table displays the Chi-Square value, DF and Prob>ChiSq (the p-value).
If Prob>ChiSq is less than the significance level, there is significant evidence of association between the row and column variables at that level. Otherwise, there is no significant evidence of association between the row and column variables at that level.
Four tests are available.
 Pearson Chi-Square:
 This is the most widely used chi-square test. The test statistic is calculated by summing the squared deviations between observed and expected counts, each divided by the expected count. Under large samples it has an approximately chi-squared distribution, so the test result is obtained by reference to the chi-squared distribution.
 Likelihood Ratio:
 The Likelihood Ratio test builds on the likelihood of the data under the null hypothesis of independence. It compares the goodness of fit of the null model with the alternative model. The test statistic also has an approximately chi-squared distribution, and it usually gives a similar result to the Pearson Chi-Square.
 Continuity Correction:
 Also referred to as Yates' Continuity Correction, this is available only for a 2*2 table in Origin. If the expected number of observations in any category is too small (e.g., less than 5), the asymptotic chi-squared distribution is not quite correct, so the Pearson Chi-Square and Likelihood Ratio results cannot be trusted and the Continuity Correction is recommended. It is similar to the Pearson Chi-Square, except that it is adjusted for the continuity of the chi-squared distribution.
 Linear Association:
 Available only for numeric data. The chi-square tests above do not take the ordering of the rows or columns into account, but Linear Association does. It is based on the Pearson correlation coefficient and has an approximately chi-squared distribution on 1 DF.
Notes: If the expected number of observations in any category is too small (e.g., less than 5), the Pearson Chi-Square and Likelihood Ratio results cannot be trusted.
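The first three tests can be reproduced outside of Origin with SciPy's chi2_contingency (a sketch with hypothetical counts; the lambda_="log-likelihood" option selects the likelihood-ratio statistic):

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = treatment/control, cols = success/failure.
obs = [[30, 10],
       [20, 25]]

# Pearson chi-square (no Yates correction).
chi2, p, dof, expected = chi2_contingency(obs, correction=False)

# Likelihood-ratio (G) test.
g, p_g, _, _ = chi2_contingency(obs, correction=False, lambda_="log-likelihood")

# Yates continuity correction, applied by default for 2x2 tables.
chi2_c, p_c, _, _ = chi2_contingency(obs, correction=True)

print(f"Pearson: chi2={chi2:.3f}, p={p:.4f}")
print(f"Likelihood ratio: G={g:.3f}, p={p_g:.4f}")
print(f"Continuity-corrected: chi2={chi2_c:.3f}, p={p_c:.4f}")
```

As expected, the continuity-corrected statistic is slightly smaller than the uncorrected Pearson statistic.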

Fisher's Exact Table
If the expected number of observations in any category is too small (e.g., less than 5), the chi-square tests may not be appropriate, and Fisher's Exact test is recommended instead.
Three tests are available: the left-sided, right-sided and two-sided tests. They enable users to see which A*B level combination is more likely to occur; see the Conclusion column for details. (A is the row variable and B is the column variable.)
Notes: Fisher's Exact test is available only for a 2*2 table.
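All three alternatives are available in SciPy's fisher_exact; a sketch with a hypothetical small-count 2*2 table, where the chi-square tests would be unreliable:

```python
from scipy.stats import fisher_exact

# Hypothetical 2x2 table with small expected counts.
obs = [[8, 2],
       [1, 5]]

odds, p_two = fisher_exact(obs, alternative="two-sided")
_, p_left = fisher_exact(obs, alternative="less")
_, p_right = fisher_exact(obs, alternative="greater")

print(f"sample odds ratio = {odds:.2f}")
print(f"two-sided p = {p_two:.4f}, left-sided p = {p_left:.4f}, right-sided p = {p_right:.4f}")
```

Here the two-sided p-value falls below 0.05, so the row and column variables are associated despite the small sample.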

Measures of Association
Please see the introduction page for the situations in which each statistic should be used.
Measures for Nominal Variables
 Phi
 For a 2*2 table, Phi ranges over [-1,1]. For tables larger than 2*2, Phi ranges over [0,M] (see the algorithm page for M). A larger value indicates a stronger association between the two variables.
 Contingency coefficient
 The range of values is [0,1). A larger value indicates a stronger association between the two variables.
 Cramer's V
 The values range from 0 to 1. A larger value indicates a stronger association between the two variables.
 Lambda
 Please see the notes below for more information on C|R, R|C and Symmetric. A larger value indicates a stronger association.
 Uncertainty Coefficient
 Please see the notes below for more information on C|R, R|C and Symmetric. A larger value indicates a stronger association.
Notes:
 C|R:
 The row variable (R) is regarded as the independent variable, and the column variable (C) as the dependent variable. The value indicates by what percentage we reduce our error when using R to predict C.
 R|C:
 The column variable (C) is regarded as the independent variable, and the row variable (R) as the dependent variable. The value indicates by what percentage we reduce our error when using C to predict R.
 Symmetric:
 The variables are not classified as independent and dependent. That is, it only measures the strength of association between the two variables; it cannot predict how one variable affects the other.
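As an illustration, the chi-square-based nominal measures can be computed by hand; a Python sketch with a hypothetical 3*3 table:

```python
import math

# Hypothetical 3x3 table of observed counts.
obs = [[25, 10, 5],
       [10, 20, 10],
       [5, 10, 25]]

n = sum(sum(r) for r in obs)
row_tot = [sum(r) for r in obs]
col_tot = [sum(c) for c in zip(*obs)]

# Pearson chi-square statistic, computed from observed vs. expected counts.
chi2 = sum((o - row_tot[i] * col_tot[j] / n) ** 2 / (row_tot[i] * col_tot[j] / n)
           for i, r in enumerate(obs) for j, o in enumerate(r))

k = min(len(obs), len(obs[0]))                 # smaller table dimension
cramers_v = math.sqrt(chi2 / n / (k - 1))      # Cramer's V in [0, 1]
contingency_c = math.sqrt(chi2 / (chi2 + n))   # contingency coefficient in [0, 1)

print(f"chi2={chi2:.2f}, Cramer's V={cramers_v:.3f}, C={contingency_c:.3f}")
```

Both measures grow with the chi-square statistic, but Cramer's V is normalized so that 1 is attainable for any table size.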

Measures for Ordinal Variables
 Gamma
 Gamma takes values from -1 to +1. A positive value means that as one variable increases, the other variable tends to increase as well, while a negative value indicates an inverse relationship. The closer the value is to 0, the weaker the relationship.
 Kendall's tau-b and tau-c
 Similar to Gamma, and interpreted in the same way.
 Somers' D
 Please see the notes below for more information on C|R, R|C and Symmetric. A larger value indicates a stronger association.
Notes:
 C|R:
 The row variable (R) is regarded as the independent variable, and the column variable (C) as the dependent variable. The value indicates the strength of association when C depends on R.
 R|C:
 The column variable (C) is regarded as the independent variable, and the row variable (R) as the dependent variable. The value indicates the strength of association when R depends on C.
 Symmetric:
 The variables are not classified as independent and dependent. That is, it only measures the strength of association between the two variables; it cannot indicate how one variable affects the other.
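Gamma is built from concordant and discordant pairs of observations, which can be counted directly from an ordered table; a Python sketch with hypothetical counts:

```python
# Hypothetical 3x3 table with ordered categories on both axes
# (e.g., low/medium/high income vs. low/medium/high satisfaction).
obs = [[20, 10, 5],
       [10, 15, 10],
       [5, 10, 20]]

nr, nc = len(obs), len(obs[0])
C = D = 0  # concordant and discordant pair counts
for i in range(nr):
    for j in range(nc):
        # Cells below and to the right form concordant pairs with (i, j).
        C += obs[i][j] * sum(obs[a][b] for a in range(i + 1, nr)
                             for b in range(j + 1, nc))
        # Cells below and to the left form discordant pairs.
        D += obs[i][j] * sum(obs[a][b] for a in range(i + 1, nr)
                             for b in range(j))

gamma = (C - D) / (C + D)
print(f"concordant={C}, discordant={D}, gamma={gamma:.3f}")
```

A positive Gamma here reflects the concentration of counts along the main diagonal; tau-b and tau-c use the same pair counts with different normalizations.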

Agreement Statistic
Please see the introduction page for the situations in which each statistic should be used.
Kappa Test
The Kappa Test table displays the value of Kappa, its standard error (SE), lower confidence limit (LCL), upper confidence limit (UCL), the Z value, Prob>Z (the p-value for a one-sided test of Kappa) and Prob>|Z| (the p-value for a two-sided test of Kappa).
From the Kappa value, users can judge the level of agreement between the two raters:
 <= 0: no agreement
 0 to 0.4: poor agreement
 0.4 to 0.59: fair agreement
 0.6 to 0.74: good agreement
 > 0.75: excellent agreement
 1: complete agreement
In addition, the Kappa Test table provides results for testing the hypothesis that Kappa equals zero.
 If Prob>Z is less than the significance level, Kappa is significantly larger than zero at that level; otherwise, Kappa is not significantly different from zero.
 If Prob>|Z| is less than the significance level, Kappa is significantly different from zero at that level; otherwise, Kappa is not significantly different from zero.
Bowker's Test
The Bowker's Test table displays the Chi-Square value, its DF and Prob>ChiSq (the p-value for Bowker's test). It tests the equality of proportions in all matched-pairs cells that are symmetric around the diagonal (that is, the null hypothesis is p_ij = p_ji for all i ≠ j).
 If Prob>ChiSq is less than the significance level, the frequency count table is significantly asymmetric at that level, that is, p_ij ≠ p_ji for some pair of cells. Otherwise, the table is not significantly asymmetric, that is, p_ij = p_ji.
Odds Ratio & Relative Risk
Odds Ratio & Relative Risk is available only for a 2*2 table. The Odds Ratio measures the ratio of the odds that an event or result will occur to the odds of the event not happening. The Relative Risk measures the ratio of the probability of an event occurring in one group to the probability of the event occurring in a comparison group.
The Odds Ratio & Relative Risk table displays the value, lower confidence limit (LCL) and upper confidence limit (UCL). Suppose the Relative Risk is RR = P(a|b)/P(a|c), where a is the outcome and b and c are the two groups. If RR = 1, the probability of outcome a is the same in b and c; if RR > 1, the probability of outcome a is greater in b than in c; otherwise, the probability of outcome a is smaller in b than in c.
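Both quantities follow directly from the four cells of a 2*2 table; a Python sketch with hypothetical counts:

```python
# Hypothetical 2x2 table: rows = exposed/unexposed, cols = disease/no disease.
a, b = 30, 70    # exposed:   30 cases, 70 non-cases
c, d = 10, 90    # unexposed: 10 cases, 90 non-cases

# Odds ratio: (a/b) / (c/d) = ad / bc.
odds_ratio = (a * d) / (b * c)

# Relative risk: ratio of the probabilities of disease in the two groups.
risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)
relative_risk = risk_exposed / risk_unexposed

print(f"OR={odds_ratio:.3f}, RR={relative_risk:.3f}")
```

When the outcome is rare, the odds ratio approximates the relative risk; with common outcomes, as here, the two can differ noticeably.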
CMH Table
Results of the Cochran-Mantel-Haenszel tests, which test whether there is any relationship between the row and column variables after controlling for the layer variable.
Conditional Independence Test
This is tested by the Mantel-Haenszel statistic, which tests the hypothesis that there is no significant association between the row and column variables after controlling for the layer variable. The Conditional Independence Test table displays the Chi-Square value, its DF and Prob>ChiSq (the p-value for the Conditional Independence Test).
 If Prob>ChiSq is less than the significance level, there is a significant association between the row and column variables in at least one layer. Otherwise, there is no significant evidence of association between the row and column variables in any layer.
Odds Ratio Homogeneity Tests
This is tested by the Breslow-Day statistic and Tarone's statistic. Both test the hypothesis that the odds ratio between the row and column variables is the same at each level of the layer variable.
The Odds Ratio Homogeneity Tests table displays the Chi-Square value, its DF and Prob>ChiSq (the p-value for the Odds Ratio Homogeneity Tests).
For both the Breslow-Day statistic and Tarone's statistic:
 If Prob>ChiSq is less than the significance level, the odds ratio differs significantly among layers. Otherwise, the odds ratio is not significantly different among layers.
Common Odds Ratio
The common odds ratio across the layer variable is estimated by the Mantel-Haenszel estimate. The Common Odds Ratio table displays the estimate of the common odds ratio, ln(estimate) (the natural log of the estimated common odds ratio) with its standard error, and the lower confidence limit (LCL) and upper confidence limit (UCL).
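The Mantel-Haenszel conditional independence statistic and common odds ratio estimate can be sketched in Python for a set of 2*2 layer tables (hypothetical counts; the statistic below omits the continuity correction):

```python
from scipy.stats import chi2 as chi2_dist

# Hypothetical 2x2 tables for two layers (e.g., two clinics),
# each [[a, b], [c, d]] with rows = treatment, cols = outcome.
layers = [[[20, 10], [12, 18]],
          [[15, 25], [8, 32]]]

a_sum = e_sum = v_sum = 0.0  # terms of the MH chi-square statistic
num_or = den_or = 0.0        # terms of the MH common odds ratio
for (a, b), (c, d) in layers:
    n = a + b + c + d
    a_sum += a
    e_sum += (a + b) * (a + c) / n  # expected a under independence
    # Hypergeometric variance of a within the stratum.
    v_sum += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    num_or += a * d / n
    den_or += b * c / n

chi2_mh = (a_sum - e_sum) ** 2 / v_sum  # 1 df, no continuity correction
p = chi2_dist.sf(chi2_mh, 1)
common_or = num_or / den_or             # Mantel-Haenszel pooled odds ratio

print(f"MH chi2={chi2_mh:.3f}, p={p:.4f}, common OR={common_or:.3f}")
```

Summing evidence across layers this way keeps the test valid even when each individual layer table is small.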
Mosaic Plot
A mosaic plot is divided into rectangles such that the area of each rectangle is proportional to the proportion of the Y variable within each level of the X variable.
