1. What is a T-Test?
The T-Test, also called Student's T-Test, is a statistical technique that compares the means of two samples in order to tell us whether they come from the same population or not.
[Figure: Student's t-distribution]
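To make the idea concrete, here is a minimal sketch of how the t statistic for two independent samples can be computed by hand. It assumes the classic pooled-variance (equal variance) form of Student's test; the function name `pooled_t_statistic` and the two small samples are made up for illustration and are not the data used in this post.

```python
import math
import statistics


def pooled_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for two independent samples.

    Assumes both samples share the same population variance
    (the classic Student's t-test with a pooled variance).
    """
    n_a, n_b = len(a), len(b)
    mean_a, mean_b = statistics.mean(a), statistics.mean(b)
    # statistics.variance uses the sample (n - 1) denominator
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    dof = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / dof
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, dof


# Illustrative numbers only, not real revenue figures
sample_a = [12.0, 15.0, 11.0, 14.0, 13.0]
sample_b = [9.0, 10.0, 8.0, 11.0, 7.0]
t_stat, dof = pooled_t_statistic(sample_a, sample_b)
print(t_stat, dof)
```

The larger the absolute value of t, the less plausible it is that both samples share the same mean. The p-value is then read from the t-distribution with the given degrees of freedom.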
2. Case Study
You work for an investment bank and you are deciding how to invest your client's money. You are considering Telecom or Hospitality, but you are not sure which one to go with, so you decide to run a test.
In order to do that, we randomly pick the last-quarter revenue of 20 companies from both industries. In this case, we don't know the population variance, which is why a t-test is appropriate. We will compute and compare the means of both samples, and look at how each sample is distributed, since the t-test assumes the data are roughly normal.
The correct way to compare statistics is to define a hypothesis. The Null Hypothesis (NH) is that the means of Telecom and Hospitality are equal, and the Alternative Hypothesis (AH) is that their means are different.
NH: Mean(Telecom) = Mean(Hospitality)
AH: Mean(Telecom) ≠ Mean(Hospitality)
Read the data file and show a statistical description of each sample:
[Figure: Data description of Telecom and Hospitality]
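The loading step might look roughly like the sketch below. The layout of the real data file is not shown in this post, so the column names `industry` and `revenue` and the inline CSV text are assumptions made purely for illustration. With a real file you would pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Placeholder CSV standing in for the real data file (hypothetical layout and values)
csv_text = """industry,revenue
Telecom,100
Telecom,110
Telecom,120
Hospitality,80
Hospitality,90
Hospitality,100
"""

df = pd.read_csv(io.StringIO(csv_text))

# count, mean, std, min, quartiles and max for each industry
summary = df.groupby("industry")["revenue"].describe()
print(summary)
```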
Have a look at how the data is distributed, using a histogram and a box plot:
[Figure: Hospitality revenue distribution]
[Figure: Telecom revenue distribution]
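Plots like these can be produced with matplotlib. The following is only a sketch: the helper `plot_distribution` is a hypothetical name, and the values passed to it are placeholders rather than the revenue data from this case study.

```python
import matplotlib.pyplot as plt


def plot_distribution(values, title):
    """Draw a histogram and a box plot of one sample side by side."""
    fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))
    ax_hist.hist(values, bins=10)
    ax_hist.set_title(f"{title}: histogram")
    ax_hist.set_xlabel("Revenue")
    ax_hist.set_ylabel("Number of companies")
    ax_box.boxplot(values)
    ax_box.set_title(f"{title}: box plot")
    ax_box.set_ylabel("Revenue")
    return fig


# Placeholder values only, not the data from this post
example_revenue = [42.0, 55.5, 48.1, 61.3, 39.9, 52.4, 47.0, 58.8]
fig = plot_distribution(example_revenue, "Hospitality Revenue Distribution")
```

In a notebook the figure is displayed automatically. In a script, call `plt.show()` or `fig.savefig(...)` afterwards.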
Calculate the p-value for the two independent samples with scipy in Python:
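A minimal sketch of this step uses `scipy.stats.ttest_ind`. The two arrays below are randomly generated stand-ins with assumed means and spread, not the revenue data from this case study, so the numbers they produce will differ from the result discussed here. The 5% significance level is also an assumption.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in samples (assumed parameters, NOT the real revenue data)
rng = np.random.default_rng(seed=0)
telecom = rng.normal(loc=50.0, scale=10.0, size=20)
hospitality = rng.normal(loc=35.0, scale=10.0, size=20)

# Two-sided t-test for two independent samples.
# By default scipy assumes equal variances; pass equal_var=False
# to run Welch's t-test when that assumption is doubtful.
t_stat, p_value = stats.ttest_ind(telecom, hospitality)

alpha = 0.05  # assumed significance level
reject_null = p_value < alpha
print(f"t = {t_stat:.3f}, p = {p_value:.5f}, reject NH: {reject_null}")
```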
The p-value from the t-test is below 0.5%, well under the usual 5% significance level, so we can reject NH. We can say that the two data samples come from different populations.
The data file and notebook are available on GitHub.