Communicate Many More Means

Business Intelligence Dashboards present frequencies, percentages and averages in one convenient location.   In this example, there are five average “Wait Time” scores compared for meaningful differences.  Each average Wait Time score represents a single week of calls to a Government Department. Total number of calls, or “n” appears in the Total Calls column.  It is important to know “n”, in this case call volumes, when interpreting averages.  Calculated Wait Time averages appear in the Average Wait Time  column (second table). There is no need to guess from the bar chart what the exact values are.

Many More Means

This example is from a Government Website (  It’s a great resource with lots of interesting analytics on key performance indicators.

General Questions From an Operations Manager:

Is the Wait Time average downward trend a good thing?  Are Wait Time averages really different for each week?  Is there a significant improvement in Wait Times over the five-week period?  What does the increase in the last week suggest?  Should something be done to make improvements in our service delivery since it looks like Wait Times are going back up?  Do we hire or fire call centre staff?

Research Questions From the Analytics Manager:

Are the differences in the average Wait Times statistically meaningful?  If they are, which week(s) show the greatest improvement?  Depending on the statistical results, are the changes in average Wait Time practically meaningful?

Hypothesis From the Statistician:

Ho:  There is No statistical difference in average Wait Time for the five weeks.

Ha:  There IS a statistical difference in average Wait Time for the five weeks. 

Statistical Test:

The One-way Analysis of Variance (ANOVA) is the appropriate statistical test to use when comparing the differences between three or more averages.
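As a rough illustration of what the One-way ANOVA computes, here is a minimal Python sketch that uses only the standard library. It assumes the individual call wait times behind each weekly average are available; the numbers below are hypothetical and are not taken from the dashboard. The function name `one_way_anova` is also just an illustrative choice. The sketch returns the F statistic and its degrees of freedom; turning F into a p-value needs an F-distribution table or a statistics package.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)           # number of groups (weeks)
    n_total = len(all_values)  # total number of observations (calls)

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual values around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical wait times in seconds for five weeks of calls
weeks = [
    [150, 162, 148, 171, 155],
    [140, 151, 138, 149, 144],
    [128, 135, 122, 140, 131],
    [115, 121, 109, 126, 118],
    [124, 130, 119, 133, 127],
]

f_stat, df_b, df_w = one_way_anova(weeks)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F means the weekly averages differ by more than the call-to-call variation within each week would explain.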

Results & Interpretation:

In this example, the ANOVA will determine whether the observed differences in average Wait Time are statistically significant.  The chart suggests a dramatic downward trend with large differences.  The ANOVA test will help managers make a practical interpretation of the apparent trend.  For example, if the downward trend is statistically significant, managers may check customer satisfaction to confirm whether a decrease in Wait Time of 41 seconds really improves service. If the trend is statistically significant, managers may also want to reward staff for their excellent performance.

More about the ANOVA test:

ANOVA test in SPSS:


Communicating t-test Results

General Question:  Your employer says, “I think men take more sick days than women in this company, and we need to do something about it.  Let me know what the stats are.”

Research Question:  Is there a significant difference between the average number of sick days taken by men and women in the past 12 months?

Hypotheses:  There IS NO significant difference between average number of sick days for men and women during the past 12 months (Null Hypothesis).  There IS a significant difference between average number of sick days for men and women during the past 12 months (Alternative Hypothesis).

Analysis Plan: Because the research question is about differences between two averages, Male and Female average sick days, the Independent groups t-test is the best statistic to use.

For more on how to pick the best statistical test please visit:

Calculate Statistic: Calculate the average number of sick days for male and female employees using IBM SPSS.  Use the Independent Groups t-test to determine if the difference between the average Male and Female sick days is significant.
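The article runs this calculation in IBM SPSS. For readers without SPSS, the following is a minimal Python sketch of the same idea, using only the standard library. It assumes equal variances in the two groups (the pooled, or Student, form of the test), and the sick-day counts are hypothetical and far smaller than the sample behind the results reported in this article. The function name `independent_t` is an illustrative choice.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Pooled-variance t statistic and degrees of freedom for two samples."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # Weighted average of the two sample variances
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    # Standard error of the difference between the two means
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / std_err, df


# Hypothetical sick days per employee over 12 months
men = [3, 5, 4, 6, 2, 5, 4, 3]
women = [6, 7, 5, 8, 6, 7, 5, 6]

t_stat, df = independent_t(men, women)
print(f"Mean (men) = {mean(men):.2f}, mean (women) = {mean(women):.2f}")
print(f"t({df}) = {t_stat:.2f}")
```

A negative t here simply means the first group (men) has the lower average, which matches the sign convention in the results reported later in this article.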

Compute Probability:  When the t-test shows that a difference this large would occur less than 1% of the time if the null hypothesis were true (p < .01), we reject the null hypothesis and suggest the alternative hypothesis is worth considering.

For more on the concept of the p-value please visit:

Communicate Results like a Statistician:

Mr. Boss, we used an Independent groups t-test to determine if there is a significant difference in the average number of sick days taken by male and female employees in the past 12 months. You will be happy to know that the results were significant: t(198) = -3.7341, p < .01.  Therefore we rejected the null hypothesis.

Therefore there IS a significant difference between the average sick days for men (50.1) and the average sick days of women (54.99) in the past 12 months.  

Women in this company took more sick days than men in the past 12 months.

IBM SPSS Output for Independent Groups t-test:

IBM SPSS t-test Output


To learn more about the t-test used in this article please visit:

The Statisticians’ Way

The role of classically trained Statisticians is to answer questions with data and communicate the logic behind the results. Rarely does a statistician attempt to bridge the gap between statistical logic and practical interpretation unless there is a content expert working closely with the team.  The typical method for communicating statistical findings follows a seven-step process called Hypothesis Testing.  There are many great places online to learn more about Hypothesis Testing (

Step 1 – General Question: Someone asks a question and wants an answer based on numerical evidence, and expects the closest thing to fact that is humanly possible. The questions may sound like this. Is there an HR problem in the Company? Do I need to hire new people? Why are sales higher in the Northeast? What does the public think of our new product? How can we improve our public image? None of these questions are statistically measurable until translated into research questions.

Step 2 – Research Question: This step involves translating general questions into a series of smaller, measurable questions. General Question: Is there an HR problem in the Company? Research Question: How trustworthy are the employees in Company X as measured by the Employee Trustworthiness Scale? Research Question: Is trustworthiness different between genders in Company X using the same measure?

Step 3 – Hypotheses: Statisticians use data to answer questions. Since 100% certainty is not possible, statistical answers are given within a degree of measurable certainty, and written as Hypotheses. Hypotheses are “plausible” explanations among many. For example, “There is no significant difference in Trustworthiness between genders” is a plausible Hypothesis to consider. (I will write more about the mechanics of Hypothesis testing in a future article).

Step 4 – Analysis Plan: You may have many Hypotheses to test. Each Hypothesis may require a unique calculation. And, each calculation may have a unique set of assumptions to consider. A well written analysis plan is essential to understanding and communicating the statistical findings in a way that is relevant to the audience.

Step 5 – Calculate a Statistic: The Hypothesis, type of data, and sample/population size dictate the appropriate statistical test. With hundreds of tests to choose from, there really is no magic for knowing what test to use. However, there are several “cheat sheets” available online (I will write more later about the mechanics of Hypothesis testing and how to use calculated statistics).

Step 6 – Compute Probability: The calculated value of a statistical test “alone” is not very informative. The Hypothesis testing process uses the calculated value to make inferences. The values are compared to computed probabilities that form the basis of the conclusion (I will write more about the mechanics of Hypothesis testing and probability in future articles).

Step 7 – Present Results: Presenting statistical results is very different from interpreting results. Presenting results follows a structure that may vary slightly depending on the statistic, but generally looks like this:

1. Choose a Test: e.g., t-test
2. Calculate a Result: e.g., t(df) = t-value, p = p-value
3. Significant? Yes / No
4. Null Hypothesis: Reject or Not Reject
5. Therefore: There IS or IS NOT a significant difference between two means
6. Conclusion: Make a statement that summarizes all previous steps
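
The six-part structure above can be sketched as a small Python helper that assembles the presentation from values already calculated. This is only an illustration: the function name `present_t_test` and its parameters are hypothetical, and the example call reuses the t-value, degrees of freedom and p-value threshold reported in the sick-days article earlier in this series.

```python
def present_t_test(t_value, df, p_text, significant, conclusion):
    """Assemble a t-test result in the six-part presentation structure."""
    decision = "Reject" if significant else "Do not reject"
    verdict = "IS" if significant else "IS NOT"
    lines = [
        "1. Test: Independent groups t-test",
        f"2. Result: t({df}) = {t_value:.2f}, {p_text}",
        f"3. Significant? {'Yes' if significant else 'No'}",
        f"4. Null Hypothesis: {decision}",
        f"5. Therefore: There {verdict} a significant difference between two means",
        f"6. Conclusion: {conclusion}",
    ]
    return "\n".join(lines)


# Values reported in the sick-days example
report = present_t_test(
    t_value=-3.7341,
    df=198,
    p_text="p < .01",
    significant=True,
    conclusion="Women in this company took more sick days than men "
               "in the past 12 months.",
)
print(report)
```

Keeping the presentation in a fixed template like this makes it harder to skip a step, and it keeps the statement of results separate from any practical interpretation.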

What’s the Difference?

Continued from the previous blog (Bad BI & Lying Charts –

The bar chart, “Average Ranking by Top Performance Categories” compares Males and Females on the following characteristics: “Flexibility”, “Performance”, and “Trustworthy”.  The Female bar is blue and the Male bar is brown.

When you look at the relative size of the bars, the difference between genders is dramatic. On the “Trustworthy” scale, Males appear 4 times more Trustworthy than Females. Yet, the calculated difference between the two is only 0.12.

Misleading Bar Chart

  • Are Male employees really four times more trustworthy than Females?
  • Is a difference of 0.12 significant?
  • What decisions would you make based on this chart?

There’s a Test for That

It’s okay to answer “not sure” to each question. And it’s equally okay to want to know.  To find out, a statistician would use a t-test.  In this example, a t-test  would determine if Male employees (average = 2.60) are significantly more Trustworthy than Female employees (average = 2.48), as the bar chart would suggest.
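
A quick arithmetic check shows how far the picture is from the numbers. The Python sketch below compares the actual ratio of the two averages with the "4 times" impression from the chart. It also works out where the vertical axis would have to start for the Male bar to be drawn four times taller, under the assumption that the exaggeration comes from an axis that does not start at zero. The source chart does not confirm that, so treat the baseline as an illustration only.

```python
male_avg = 2.60    # average Trustworthy score for Males (from the chart)
female_avg = 2.48  # average Trustworthy score for Females (from the chart)


def bar_ratio(a, b, baseline=0.0):
    """Ratio of two bar heights when the axis starts at `baseline`."""
    return (a - baseline) / (b - baseline)


# Ratio of the actual averages, with bars drawn from zero
true_ratio = bar_ratio(male_avg, female_avg)

# Baseline b at which the Male bar looks 4x taller:
# (male - b) = 4 * (female - b)  =>  b = (4 * female - male) / 3
implied_baseline = (4 * female_avg - male_avg) / 3

print(f"Difference in averages: {male_avg - female_avg:.2f}")
print(f"Ratio of averages: {true_ratio:.3f}")
print(f"Axis start that makes the bars look 4x apart: {implied_baseline:.2f}")
```

With bars drawn from zero, the Male bar would be only about 5% taller than the Female bar. An axis starting near 2.44 would be enough to stretch that into a fourfold visual gap.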

There are many tutorials online that teach the t-test. For instance, StatsCast: What is a t-test ( is very good. Without the original data to perform the t-test, I can only suggest that the bar chart is misleading… well, very misleading. In fact, it may even be a lie.

t-test defined:  “In simple terms, the t-test compares the actual difference between two means in relation to the variation in the data” (