Chi-square test – Types, Concepts, Examples


The Chi-square (χ2) test is a statistical test used to determine whether the distribution of observed data is consistent with the distribution expected under a particular hypothesis. It can be used to assess the goodness of fit of a given distribution to observed data, or to test whether two categorical variables are independent. In this blog post, we will discuss the types of Chi-square tests, the concepts behind them, and how to perform them using Python. As data scientists, we need a strong understanding of the Chi-square test so that we can make informed decisions about our data. We will also work through some examples so that you can see how the tests work in practice.

What is a Chi-square (χ2) test and what are its uses?

The Chi-square test is a statistical test used to evaluate whether the difference between the observed and expected frequencies of a categorical variable is statistically significant, or whether two categorical variables are related. It is based on the Chi-square statistic, which is a function of the difference between the expected and observed values. For example, the Chi-square test could be used to evaluate whether the outcomes of tossing a coin or rolling a die 100 times differ significantly from what a fair coin or die would produce.

The null hypothesis for the Chi-square test is that there is no real difference between the observed counts of the categories and their expected values. In other words, any difference between the observed and expected values is merely random (not statistically significant). If the Chi-square statistic is significant, the null hypothesis is rejected in favor of the alternative hypothesis, which states that the observed counts do differ from the expected counts.

The Chi-square test is also used to test for an association between two variables in a contingency table. A contingency table is a table that shows the frequencies of the combinations of two or more categorical variables. The test can also be extended to more than two variables in a multi-way contingency table.

The Chi-square value is calculated by taking the squared difference between the observed and expected value for each category, dividing it by the expected value, and summing the results over all categories.

Chi-square statistic = sum of ((observed value - expected value)^2 / expected value)

If the Chi-square value is larger than the critical value for the chosen significance level and degrees of freedom, it indicates that there is a significant difference between the observed and expected data, and hence the null hypothesis can be rejected.
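
The calculation and the comparison against the critical value can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the die counts are made up for the example, and the critical value is looked up with scipy.stats.chi2.ppf using one degree of freedom fewer than the number of categories.

```python
import numpy as np
from scipy.stats import chi2

# Hypothetical counts for a six-sided die rolled 60 times (illustrative data only).
observed = np.array([8, 12, 9, 11, 10, 10])
expected = np.full(6, observed.sum() / 6)  # a fair die: 10 per face

# Sum over categories of (observed - expected)^2 / expected.
chi_square = float(((observed - expected) ** 2 / expected).sum())

# Critical value at alpha = 0.05 with (number of categories - 1) degrees of freedom.
alpha = 0.05
dof = len(observed) - 1
critical_value = float(chi2.ppf(1 - alpha, dof))

print(f"chi-square = {chi_square:.3f}, critical value = {critical_value:.3f}")
print("reject H0" if chi_square > critical_value else "fail to reject H0")
```

With these made-up counts the statistic is well below the critical value, so the null hypothesis of a fair die would not be rejected.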

Chi-square tests can be used to test for relationships between categorical variables, to assess the goodness of fit of a model, or to test for independence between two variables. When used properly, the Chi-square test can be a powerful tool for understanding data.

Different types of Chi-square tests, and what do they measure?

Chi-square tests can be used for both goodness of fit tests and tests of independence. A goodness of fit test is used to determine whether the difference between the observed and expected frequencies of a categorical variable is merely random / accidental or statistically significant. A Chi-square test of independence is used to determine whether two categorical variables are independent of each other or related. The type of test used will depend on the data being collected. Chi-square tests are based on the assumption that the data come from a random sample and that the observations are independent of each other. This means that each observation falls into exactly one category and does not influence any other observation.

The Chi-square statistic is used to calculate the p-value for both the goodness of fit test and the test of independence. The p-value is the probability of obtaining a Chi-square statistic at least as large as the one observed if the null hypothesis is true, and it is used to decide whether the null hypothesis should be rejected. The null hypothesis is rejected if the p-value is less than the alpha level, which is typically 0.05. If the null hypothesis is rejected, then there is a significant difference between the expected and observed distributions. If the null hypothesis is not rejected, then the data do not show a significant difference between the expected and observed distributions.

The chi-square test relies on an approximation, and there are some situations where it should not be used. A common rule of thumb is that the test is unreliable if any expected count is less than one, or if more than 20% of the expected counts are less than five. In addition, the chi-square test does not tell us about causation; it only tells us about association. For example, if we find that there is a relationship between gender and political affiliation using the chi-square test, we cannot conclude that one causes the other. We can only conclude that there is an association between gender and political affiliation.
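
Before running the test, it can help to inspect the expected counts. The sketch below is only an illustration: summarize_expected_counts is a hypothetical helper name, and the expected counts passed to it are made up.

```python
import numpy as np

def summarize_expected_counts(expected, threshold=5):
    """Return the smallest expected count and the share of cells below `threshold`."""
    expected = np.asarray(expected, dtype=float)
    share_below = float((expected < threshold).mean())
    return float(expected.min()), share_below

# Hypothetical expected counts for a test with four categories.
smallest, share_below = summarize_expected_counts([12.0, 7.5, 4.0, 16.5])
print(f"smallest expected count = {smallest}, share below 5 = {share_below:.0%}")
```

Here one of the four expected counts is below five, which is a signal to merge categories or collect more data before relying on the chi-square approximation.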

Chi-square goodness of fit test

Chi-square goodness of fit tests are statistical tests used to evaluate whether the difference between the observed and expected frequencies of a categorical variable is due to chance / randomness or whether it is statistically significant. They are used when the expected frequencies are known or can be derived from a hypothesized distribution. A goodness of fit test is applied to a single categorical variable with two or more categories.

For example, let’s say we want to test whether the outcomes (heads vs tails) of tossing a coin differ significantly from what a fair coin would produce. Here is the contingency table representing the outcomes. The task is to evaluate, at the 5% significance level, whether the observed outcome could have happened merely by chance.

Chi-square goodness of fit - Tossing coin

Let’s formulate the hypotheses:

Null hypothesis, H0: The coin is fair, and the outcome of the tosses is due to chance / randomness. That is, a split such as 70 heads and 30 tails has happened by chance and is not statistically significant.

Alternative hypothesis, Ha: The outcome of the tosses is not merely due to chance / randomness and is statistically significant.

The significance level is set at 0.05. The expected value is 50 heads and 50 tails when the coin is tossed 100 times. The observed value is 70 heads and 30 tails. Let’s calculate the chi-square statistic.

Chi-square value = (70-50)^2 / 50 + (30-50)^2 / 50

= 400/50 + 400/50

= 800 / 50 = 16

With two categories (heads and tails), there is 2 - 1 = 1 degree of freedom. At a significance level of 0.05 and 1 degree of freedom, the critical value is 3.841. The calculated chi-square statistic is 16, which is greater than the critical value of 3.841. Thus, the null hypothesis can be rejected: at the 5% significance level, the outcome is statistically significant and the coin does not appear to be fair.

If instead the number of heads had been 58 and the number of tails 42, the chi-square statistic would be (58-50)^2 / 50 + (42-50)^2 / 50 = 2.56, which is less than the critical value of 3.841. In that case, the null hypothesis can’t be rejected. This outcome could have happened by chance.
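
The two coin scenarios above can be reproduced by hand in Python. This is a minimal sketch: chi_square_statistic is a helper name introduced only for this illustration, and the critical value and p-values come from scipy.stats.chi2.

```python
from scipy.stats import chi2

def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

alpha = 0.05
dof = 1  # two categories (heads, tails) minus one
critical_value = chi2.ppf(1 - alpha, dof)  # about 3.841

for observed in ([70, 30], [58, 42]):
    statistic = chi_square_statistic(observed, [50, 50])
    # Probability of a statistic at least this large if the coin is fair.
    p_value = chi2.sf(statistic, dof)
    decision = "reject H0" if statistic > critical_value else "fail to reject H0"
    print(f"{observed}: chi-square = {statistic:.2f}, p-value = {p_value:.4f}, {decision}")
```

Comparing the statistic with the critical value and comparing the p-value with the significance level always lead to the same decision.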

Chi-square test of independence 

The chi-square test of independence is a statistical test used to determine whether two variables are independent of each other. The test is used when the variables are categorical and the data are arranged in a contingency table. The chi-square test statistic is calculated by comparing the observed frequencies in the contingency table to the expected frequencies, based on the assumption of independence. Under that assumption, the expected frequency of each cell is (row total × column total) / grand total, and the degrees of freedom are (number of rows - 1) × (number of columns - 1). If the two variables are independent, the chi-square statistic will tend to be small. If there is a significant association between the two variables, the chi-square statistic will be large.
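
A test of independence can be sketched with SciPy's scipy.stats.chi2_contingency function. The 2x2 table below is hypothetical and exists only to illustrate the call. Passing correction=False is a choice made here so that the result matches the plain chi-square formula; by default SciPy applies Yates' continuity correction to 2x2 tables.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Hypothetical 2x2 contingency table (illustrative counts only):
# rows are two groups, columns are two response categories.
observed = np.array([[40, 10],
                     [20, 30]])

# correction=False turns off Yates' continuity correction so the statistic
# equals the plain sum of (observed - expected)^2 / expected.
statistic, p_value, dof, expected = chi2_contingency(observed, correction=False)

print("expected counts under independence:")
print(expected)
print(f"chi-square = {statistic:.3f}, dof = {dof}, p-value = {p_value:.5f}")
```

The expected counts returned by the function are the values that independence would predict from the row and column totals, so they are worth inspecting alongside the p-value.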

How do you perform a Chi-square test in Python?

Chi-square statistics for a goodness of fit test are calculated in Python using the scipy.stats.chisquare() function. This function takes an array of observed values (f_obs) and, optionally, an array of expected values (f_exp); when f_exp is omitted, all categories are assumed to be equally likely. The chi-square statistic is calculated by subtracting the expected values from the observed values, squaring the result, dividing by the expected values, and summing over the categories. The function returns the statistic together with its p-value, which is then compared to the significance level to determine whether or not the difference is statistically significant.

The coin example from the previous section (100 tosses resulting in 70 heads and 30 tails) can be handled in Python as follows:

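from scipy.stats import chisquare
# Observed counts (70 heads, 30 tails) against the expected counts for a fair coin
result = chisquare([70, 30], f_exp=[50, 50])
print(result)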

The output from above is the following:

Power_divergenceResult(statistic=16.0, pvalue=6.334248366623988e-05)

Given that the p-value is less than 0.05, the null hypothesis can be rejected.
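
The same function can be applied to the 58 heads / 42 tails scenario. The sketch below leaves out f_exp, relying on chisquare's default of equally likely categories, and reads the statistic and p-value from the returned result. The 0.05 threshold is the significance level used throughout this post.

```python
from scipy.stats import chisquare

alpha = 0.05

# With f_exp omitted, chisquare assumes all categories are equally likely,
# which for two categories is the fair-coin expectation of 50 / 50.
result = chisquare([58, 42])

print(f"chi-square = {result.statistic:.2f}, p-value = {result.pvalue:.4f}")
print("reject H0" if result.pvalue < alpha else "fail to reject H0")
```

Here the p-value is above 0.05, so the null hypothesis is not rejected, in line with the manual calculation.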

Some examples of how to use the Chi-square test in practice

The following are some examples of questions the Chi-square test can help answer.

  • In a study of 1000 students done in 2020, 575 were female and 425 were male. A study done in 2022 found 585 female and 415 male students. Is the variation statistically significant?
  • In a study of 1000 people in 2015, 60% were smokers and 40% were non-smokers. In 2022, the study reported 700 people as smokers. Is the variation statistically significant?
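
One way to approach both questions is a goodness of fit test that treats the earlier study's proportions as the expected distribution. The sketch below rests on two assumptions that are not stated in the examples: the earlier proportions are taken as known rather than estimated, and the 2022 smoking study is assumed to cover 1000 people. If both years are treated as samples, a test of independence on the 2x2 table of year by category would be the alternative.

```python
from scipy.stats import chisquare

# Example 1: 2022 counts (585 female, 415 male) against the 2020 split (575 / 425).
students = chisquare([585, 415], f_exp=[575, 425])

# Example 2: assumes 1000 people in 2022, so 700 smokers and 300 non-smokers,
# against the 2015 split of 60% / 40%.
smokers = chisquare([700, 300], f_exp=[600, 400])

for name, result in (("students", students), ("smokers", smokers)):
    verdict = "significant" if result.pvalue < 0.05 else "not significant"
    print(f"{name}: chi-square = {result.statistic:.3f}, p-value = {result.pvalue:.4g} ({verdict})")
```

Under these assumptions, the change in the student split is not statistically significant, while the change in the share of smokers is.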

That’s it for our introduction to the Chi-square (χ2) test. We hope this article has given you a better understanding of what this powerful tool can do, and how to use it. If you have any questions or would like more information, please don’t hesitate to reach out to us. We’re always happy to help!

Ajitesh Kumar