When to Use Z-test vs T-test: Differences, Examples

When it comes to statistical tests, the z-test and the t-test are two of the most commonly used. But what is the difference between the z-test and the t-test? And when should you use a Z-test vs a T-test? In this blog post, we will answer all these questions and more! We will start by explaining the difference between the z-test and the t-test in terms of their formulas. Then we will go over some examples so that you can see how each test is used in practice. As data scientists, it is important to understand the difference between the z-test and the t-test so that you can choose the right test for your data. Let's get started!

Difference between Z-test and T-test

The Z-test is a statistical hypothesis testing technique used to test null hypotheses such as the following, provided that the population standard deviation is known and the data follows a normal distribution:

  • There is no difference between the sample and the population, i.e., the difference between the sample mean and the population mean is not statistically significant. This hypothesis can be tested using a one-sample Z-test for means. In other words, the one-sample Z-test for means can be used to test the hypothesis that the sample belongs to the population. In this test, the mean of the sample is compared against the population mean in the sampling distribution.

    For example, suppose a researcher wants to investigate if the average height of students in a particular university differs from the average height of college students across the country. They could collect a random sample of students from the university and calculate the mean height of the sample. They can then conduct a Z-test to determine if the difference between the sample mean and the population mean is statistically significant or not.

    The formula for the Z-statistic for the one-sample Z-test for means is given below. The standard error in the formula is the standard deviation of the sampling distribution of the mean, which is the distribution of all possible sample means that could be obtained from the population. Read greater details in this blog, one-sample Z-test for means. The Z-statistic measures how many standard errors (standard deviations of the sampling distribution) the sample mean is away from the population mean. It is used to determine the statistical significance of the difference between the sample mean and the population mean.

           Z = (X̄ – µ)/SE

              = (X̄ – µ)/(σ/√n), where SE is the standard error, X̄ is the sample mean, µ is the population mean, σ is the population standard deviation, and n is the sample size

  • There is no difference between the two independent samples, i.e., the difference between the two sample means is not statistically significant. This hypothesis can be tested using a two-sample Z-test for means, a statistical test used to compare the means of two independent samples. The null hypothesis for a two-sample Z-test for means states that there is no significant difference between the means of the two samples. The formula for the Z-statistic is the following (a worked Python sketch follows this list). Read further details in this blog, Two-sample Z-test for means.

           Z = (X̄₁ – X̄₂)/√(σ₁²/n₁ + σ₂²/n₂), where X̄₁ and X̄₂ are the two sample means, σ₁ and σ₂ are the population standard deviations, and n₁ and n₂ are the two sample sizes

  • There is no difference between the hypothesized proportion and the theoretical population proportion. This hypothesis can be tested using one-sample Z-test for proportion. Greater details can be read in this blog, one-sample Z-test for proportion.
  • There is no difference between the proportions belonging to two different populations. This hypothesis can be tested using two-sample Z-test for proportions. Greater details can be read in this blog, two-sample Z-test for proportions.
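All of the Z-tests listed above follow the same pattern: the observed difference is divided by its standard error and the result is compared against the standard normal distribution. Below is a minimal Python sketch of the one-sample Z-test for means, computed by hand with NumPy and SciPy. The heights, pop_mean, and pop_sd values are made-up illustration values, not real data; statsmodels also provides ztest and proportions_ztest helpers that cover the remaining cases.

    # Minimal sketch of a one-sample Z-test for means (hypothetical data)
    import numpy as np
    from scipy import stats

    heights = np.array([172.0, 168.0, 175.0, 171.0, 169.0, 174.0, 170.0, 173.0,
                        176.0, 167.0, 172.0, 171.0, 169.0, 175.0, 170.0, 168.0,
                        174.0, 172.0, 171.0, 173.0, 169.0, 175.0, 170.0, 172.0,
                        168.0, 174.0, 171.0, 173.0, 170.0, 169.0, 172.0, 175.0])
    pop_mean = 170.0   # assumed national average height (hypothetical value)
    pop_sd = 7.5       # assumed known population standard deviation (hypothetical value)

    n = len(heights)
    sample_mean = heights.mean()
    standard_error = pop_sd / np.sqrt(n)                 # SE = sigma / sqrt(n)
    z_stat = (sample_mean - pop_mean) / standard_error   # Z = (X-bar - mu) / SE

    # Two-sided p-value from the standard normal distribution
    p_value = 2 * stats.norm.sf(abs(z_stat))
    print(f"z = {z_stat:.3f}, p = {p_value:.4f}")

With these hypothetical numbers, a p-value below the chosen significance level (e.g., 0.05) would lead to rejecting the null hypothesis that the university's average height equals the national average.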

The T-test is a statistical hypothesis testing technique used to test null hypotheses such as the following, provided that the population standard deviation is unknown, the data follows a normal distribution, and the sample size is small (less than 30):

  • There is no difference between the sample mean and the population mean, given that either the population standard deviation is known and the sample size is small, or the population standard deviation is unknown. This is very similar to the one-sample Z-test for means. Greater details can be read in this blog, one-sample t-test for means. The formula for the t-statistic looks like the following. Note that the sample mean is compared with the population mean, as in the one-sample Z-test. The difference lies in how the standard error is calculated: it is the ratio of the sample standard deviation to the square root of the sample size.

             T = (X̄ – μ)/SE

                = (X̄ – μ)/(S/√n), where SE is the standard error, X̄ is the sample mean, µ is the population mean, S is the sample standard deviation, and n is the sample size. Note the difference between the Z-statistic and the T-statistic: the one-sample Z-test uses the population standard deviation σ, while the one-sample T-test uses the sample standard deviation S.

  • There is no difference between the means of the two populations, given that either the population standard deviation is known and the sample size is small, or the population standard deviation is unknown. This hypothesis can be tested using a two-sample t-test for independent samples. For the two-sample t-test for independent samples, different formulas exist depending on whether the variances of the two populations are assumed to be equal. When the population variances are assumed to be equal, the pooled variance is used to calculate the T-statistic. Read further details about the two-sample t-test for independent samples in this blog, two-samples t-test for independent samples: formula and examples, and note the difference between the formulas for the two-sample Z-test for means and the two-sample t-test for means in the respective blogs. The formula for the two-sample t-test for independent samples, given that the population variances are equal, is the following (a Python sketch follows this list):

           T = (X̄₁ – X̄₂)/(Sₚ√(1/n₁ + 1/n₂)), where Sₚ is the pooled standard deviation given by Sₚ² = ((n₁ – 1)S₁² + (n₂ – 1)S₂²)/(n₁ + n₂ – 2), X̄₁ and X̄₂ are the two sample means, S₁ and S₂ are the two sample standard deviations, and n₁ and n₂ are the two sample sizes
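The t-test formulas above map directly onto scipy.stats helpers. The following is a minimal sketch with made-up sample values: ttest_1samp covers the one-sample case, and ttest_ind with equal_var=True applies the pooled-variance formula shown above (setting equal_var=False instead runs Welch's t-test for unequal variances).

    # Minimal sketch of one-sample and two-sample t-tests (hypothetical data)
    import numpy as np
    from scipy import stats

    # One-sample t-test: population sigma unknown, small sample (n < 30)
    sample = np.array([12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 11.9, 12.4,
                       12.2, 11.7, 12.0, 12.6])
    t_stat, p_value = stats.ttest_1samp(sample, popmean=12.0)
    print(f"one-sample:  t = {t_stat:.3f}, p = {p_value:.4f}")

    # Two-sample t-test for independent samples, assuming equal population variances
    group_a = np.array([5.1, 4.9, 5.3, 5.0, 4.8, 5.2, 5.1, 4.7, 5.0, 5.2])
    group_b = np.array([5.4, 5.6, 5.2, 5.5, 5.7, 5.3, 5.6, 5.4, 5.5, 5.8])
    t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=True)
    print(f"two-sample:  t = {t_stat:.3f}, p = {p_value:.4f}")

Both calls return the t-statistic along with the two-sided p-value.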

Other differences between the Z-test and T-test are the following:

  • While the Z-test makes use of the Z-distribution (standard normal distribution), the T-test makes use of the T-distribution.
  • While the T-test requires degrees of freedom for the calculation of the T-statistic, the Z-test does not need degrees of freedom (see the short example after this list).
  • For independent samples whose population standard deviations are unknown but can be assumed equal, use the t-statistic with pooled variance rather than the Z-test; the Z-test applies only when the population standard deviations are known.
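A short illustration of the last two points: the t critical value depends on the degrees of freedom and approaches the z critical value as the degrees of freedom grow, which is also why the two tests give nearly identical results for large samples.

    # Compare z and t critical values at a 5% two-sided significance level
    from scipy import stats

    alpha = 0.05
    z_crit = stats.norm.ppf(1 - alpha / 2)        # two-sided z critical value (about 1.96)
    print(f"z critical value: {z_crit:.3f}")

    # t critical values shrink toward the z value as degrees of freedom increase
    for df in (5, 10, 30, 100, 1000):
        t_crit = stats.t.ppf(1 - alpha / 2, df)
        print(f"df = {df:4d}: t critical value = {t_crit:.3f}")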

When to use Z-test vs T-test?

The following is a simple diagram that summarizes when to use the Z-test vs the T-test:

[Diagram: when to use the Z-test vs the T-test]

Note some of the following in the above diagram:

  • If the population standard deviation is known and the sample size is greater than 30, the Z-test is recommended.
  • If the population standard deviation is known and the sample size is less than 30, the T-test is recommended.
  • If the population standard deviation is unknown, the T-test is recommended.

Taking the previous considerations into account, one can select the Z-test or the T-test depending on the type of hypothesis being tested (a small helper sketch after the list below illustrates this selection logic):

  • If we need to compare means of two independent samples (e.g., two different groups), we can use either a Z-test or a T-test.
  • If we need to compare means of paired samples (e.g., before and after measurements), a paired T-test is typically used.
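The selection logic above can be summarized in a small helper function. The function name and parameters below are purely illustrative and do not come from any statistics library.

    # Hypothetical helper encoding the decision rules from the diagram above
    def choose_test(population_sd_known: bool, sample_size: int, paired: bool = False) -> str:
        if paired:
            return "paired t-test"
        if population_sd_known and sample_size > 30:
            return "Z-test"
        # sigma unknown, or sigma known but the sample is small
        return "t-test"

    print(choose_test(population_sd_known=True, sample_size=50))                 # Z-test
    print(choose_test(population_sd_known=True, sample_size=20))                 # t-test
    print(choose_test(population_sd_known=False, sample_size=100))               # t-test
    print(choose_test(population_sd_known=False, sample_size=25, paired=True))   # paired t-test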

Summary

The z-test and t-test are different statistical hypothesis tests that help determine whether there is a difference between two population means or proportions. The z-statistic is used to test the null hypothesis about whether there is a difference between the population means or proportions, given that the population standard deviation is known, the data follows a normal distribution, and the sample size is large enough (greater than 30). T-tests are used when the population standard deviation is unknown, the data follows a normal distribution, and the sample size is small (less than 30).
