By Dr. Marloes Nitert and Dr. Sidney Dekker
Those who’ve been around safety (and particularly Safety Differently) long enough know that LTI (Lost Time Injuries) is a lousy safety measure. LTI, after all, was once instituted as a productivity measure, not a safety measure. But LTI is actually quite a silly measure too. This blog shows just how silly it gets, and how foolish (or statistically meaningless) any claims about LTI reduction really are.
We once worked with a company of 85 employees that was very proud of its safety record. Over the past four years, their injuries dropped from 19, to 7, to 4, and then to 1. What a marvelous accomplishment! Managers were understandably glowing. Of course, any reduction in actually hurting people is a good thing. Any reduction in honesty around reporting, or not calling injuries what they are (rather than creatively case-managed instances of ‘light duties’), is of course not a good thing. But that didn’t seem to be the main issue here.
The issue was how the managers felt about their own interventions and actions. They’d done a bunch of things (like putting up posters telling everybody else to be more careful out there, and affixing stickers on bathroom mirrors telling you that you were looking at the person most responsible for your safety) that they believed were responsible for this amazing reduction. They also did some more training of their people, and reminded them of the appropriate protective equipment to wear for various tasks.
Like in most industries, however, the injury numbers were so small
Here is why. Remember, the company had 85 employees. We established that together they worked 170,000
- There is a 92% probability that the injury reduction is just random noise. In other words, a fat chance (very fat: 92% fat) that the managers had nothing, absolutely nothing to do with the reduction.
- In science, we typically want to be 95% certain of something before we claim that it happened, or that we know what might have caused it to happen. So let’s translate the managers’ claim that their actions and interventions were responsible for the injury drop. To claim with 95% certainty that the injury reduction wasn’t just noise but the result of what managers did, the 85 workers would together have to suffer 20,400 injuries in year 1, down to 1,020 injuries in year 4.
- You could also look at it the other way around. If the company insisted that a drop from 19 to 1 injuries was 95% certain to be real and not just random variation or noise, it would have to employ an additional 53,129 workers
The results show the dire need for managerial and board humility in claiming credit for LTI reductions. They also show the tyranny of small numbers and the uselessness of LTI as a safety measure. And let’s not forget: LTI says nothing about severity of injury or length of absence from work. If it wasn’t useless as a measure of anything already, then that certainly makes it so.
What does statistically significant mean?
Statistically significant, formally speaking here, is the likelihood that the reduction in the number of injuries is caused by the intervention rather than by chance. If a manager or board wants to be confident that a reduction in injuries is due to what they did, rather than mere random variations (which can go as wildly up as they can go down!) then the numbers need to meet various stringent requirements.
Remember the company we started with here:
- 85 workers
- Three 8-hour shifts, 5 days a week, which amounted to 170,000 hours worked (because each employee worked 50 weeks a year)
- Injury reduction over four years: from 19, to 7, to 4, to 1.
Of course, these numbers alone don’t mean much. What if the company did some serious downsizing during those 4 years, and in the last year they had only 1 employee left (not 85), and then that poor sod managed to get him- or herself injured? Or the company actually managed to hold on to their 85 people but they had nothing to do because there really wasn’t any work? This is why we need an injury rate: the number of injuries relative to the number of hours worked.
In this company, people worked 170,000 hours annually. The injury rate dropped from 0.011% (19 injuries / 170,000 hours worked x 100) down to 0.0006% (1 injury / 170,000 hours worked x 100) over four years.
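The arithmetic behind these figures can be retraced in a few lines of Python. This is only a sketch: the names are made up for illustration, and it assumes each of the 85 employees works one 8-hour shift a day, 5 days a week, 50 weeks a year.

```python
# Illustrative names; the figures come from the text above.
EMPLOYEES = 85
HOURS_PER_SHIFT = 8
DAYS_PER_WEEK = 5
WEEKS_PER_YEAR = 50


def hours_worked(employees=EMPLOYEES):
    """Total hours worked per year by the whole workforce."""
    return employees * HOURS_PER_SHIFT * DAYS_PER_WEEK * WEEKS_PER_YEAR


def injury_rate_percent(injuries, hours):
    """Injuries per hour worked, expressed as a percentage."""
    return injuries / hours * 100


annual_hours = hours_worked()
rate_year1 = injury_rate_percent(19, annual_hours)
rate_year4 = injury_rate_percent(1, annual_hours)

print(f"Hours worked per year: {annual_hours:,}")
print(f"Injury rate year 1: {rate_year1:.3f}%")
print(f"Injury rate year 4: {rate_year4:.4f}%")
```

Changing the headcount or the number of weeks shows how quickly the rate moves even when the injury count stays the same.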
A manager would of course love to claim that the drop from 19 injuries to 1 injury is significant. In a sense, of
What is the probability that the injury rate reduction is just random noise?
If you look at absolute numbers, a drop from 19 injured workers to 1 out of 85 workers, then yes, you might be able to make the claim that such a drop is statistically significant. A statistical test to ‘prove’ this with is called Chi-square (which isn’t very exact with such small numbers in any case, but we have no more to go on in this case).
The probability that the injury rate drop from 19 to 1 is just random noise is 92%
Of course, absolute numbers are not a rate: they are not an injury frequency rate. A reduction in the absolute number of injured workers out of the 85 from one year to the next is nice (really nice, for sure). But it is extremely likely that it is pure chance. In other words, there is no basis to claim that the drop is due to what the manager(s) or the board did. There is, in fact, a way to formally calculate the chance that the reduction is indeed purely chance. (Got that?)
We can calculate this by going back to the injury rate (rather than the absolute number of injuries) and doing the same Chi-square test between year 1 and year 4, which shows the following:
|Injury rate||Non-injury rate||Total|
P value = 0.92: this means that there is only an 8% chance that the difference in injury rate between years 1 and 4 is real and sustainable if the managers keep doing what they are doing. But there is a 92% chance that this difference is just utterly random, and completely disconnected from whatever the managers are doing.
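For readers who want to retrace the test, here is a hedged Python sketch. The data rows of the table above are not shown, so the sketch assumes each row holds percentages (injury rate and non-injury rate, adding up to 100) for year 1 and year 4. The function name and the choice of a Pearson Chi-square without continuity correction are assumptions too. Note that a Chi-square statistic scales with the table total, so a table built on a different denominator would give a different P value.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square for a 2x2 table, without continuity correction.

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the upper-tail probability of the
    # chi-square distribution equals erfc(sqrt(statistic / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


HOURS = 170_000                  # hours worked per year, from the text
rate_year1 = 19 / HOURS * 100    # injury rate in percent, year 1
rate_year4 = 1 / HOURS * 100     # injury rate in percent, year 4

# Assumed layout: [injury rate, non-injury rate] per year, in percent.
table = [
    [rate_year1, 100 - rate_year1],
    [rate_year4, 100 - rate_year4],
]

statistic, p_value = chi_square_2x2(table)
print(f"P value = {p_value:.2f}")  # prints "P value = 0.92"
```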
It becomes clear that the reduction is very likely due to chance. In fact, the chance that the drop is just a random variation is a whopping 92%. So any manager(s) claiming that the drop is due to their interventions and actions is not really to be believed. Well, if you insist, you should only believe the manager(s) 8% of the time; or only 8 out of 100 times that they make the claim. Or, you should only believe them for just over half an hour during a workday: the rest of the day they’re just blabbing, or spouting grandiose nonsense.
How many people or injuries do you need to claim statistical significance?
Of course, a manager would want to be believed (or believe themselves) more times than 8 out of 100. The typically desired level of statistical significance is 95% (P value = 0.05) in cases such as these. That would mean that the manager or board can be believed 19 out of 20 times they make the claim. Once out of twenty times, the reduction in injuries would still be the result of random variations. But how many people would the manager need to employ, or how many injuries would these people need to have, to achieve that level of statistical significance (or, to put it differently, a 95% confidence that the injury rate reduction is not random)? We turn to this now.
So how many injuries would you actually need to begin with (and then go down to) if you want to claim with 95% certainty that the drop is due to your interventions and actions, and not just a random variation?
Let’s ask that question again, but now in regard to the figures (a 19-fold drop) that we have already discussed:
If a manager of a company with 85 workers, who together work 170,000 hours per year, wanted to claim, with 95% certainty, that a 19-fold drop in injuries in his or her company is due to his or her interventions and actions, then how many extra people would he or she need to employ, or how many injuries would he or she actually need in absolute numbers, from year 1 to year 4?
We find the two answers through what is called a statistical power calculation. The main purpose of a power analysis, in formal terms, is to determine the smallest sample size that is suitable to detect the effect of a given test at the desired level of significance. In other words, if a manager wants to be 95% certain (a typical level of statistical significance) that a 19-fold drop in injuries is due to his or her actions, how many workers or injuries would he or she need to demonstrate this?
How many people do you need to employ if you have 19 injuries?
Statistical power calculations can be done in various ways. The most common use is to determine the sample size necessary to be 95% certain that the effect was actually the result of the interventions or actions the manager made. We can use, in other words, a statistical power calculation to find out how many people the manager would need to employ if he or she wants to be 95% certain that his or her injury frequency rate drops from 0.011% to 0.0006% (as from 19 to 1, like in this example).
In that case, the power calculation shows that the manager would need to employ 53,214 people rather than 85. And he or she would still be wrong to claim that the drop was due to their actions 1 out of 20 times. So only by the time the manager has employed the 53,214th person (while keeping all the 53,213 others at the same time!) can he or she be 95% sure that a drop from 19 injuries to one injury is due to his or her interventions and actions. Only with a sample that large can such certainty be claimed.
Formula for sample size calculation
where α is 0.05
z is the normalized score
π is the proportion (injury rate), with p0 = 0.00011 and p1 = 0.000006
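As an illustration of how such a calculation works, here is a minimal Python sketch of one common textbook sample-size formula for comparing two proportions. It is not necessarily the formula used for the figures in this post, so it should not be expected to reproduce them. The function name, the two-sided test and the 80% power are all assumptions made for the sake of the example.

```python
import math
from statistics import NormalDist


def sample_size_two_proportions(p0, p1, alpha=0.05, power=0.80):
    """Observations needed per group to detect a difference between p0 and p1.

    Assumed textbook formula (two-sided test, equal group sizes):
    n = (z_a * sqrt(2 * p_bar * (1 - p_bar))
         + z_b * sqrt(p0 * (1 - p0) + p1 * (1 - p1)))**2 / (p0 - p1)**2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # normalized score for alpha
    z_beta = NormalDist().inv_cdf(power)           # normalized score for power
    p_bar = (p0 + p1) / 2                          # pooled proportion
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))
    ) ** 2
    return math.ceil(numerator / (p0 - p1) ** 2)


# Proportions from the text: injuries per hour worked in year 1 and year 4.
n_per_group = sample_size_two_proportions(0.00011, 0.000006)
print(f"Observations needed per group: {n_per_group:,}")
```

The result is expressed in whatever unit the proportions are measured in (here, hours worked), and it shrinks quickly as the two proportions move further apart.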
This does mean, by the way, that a company which actually employs some 65,000 people, and which reports a drop from 19 injuries to 1 injury across all of those 65,000 people, could have enough statistical power (depending on the actual hours worked by all those people) to support the claim that the drop is due to what managers did. But not a company or a site with 85 people.
If you employ 85 people, how many injuries do they need to suffer for you to know that your interventions are actually working?
The other way in which we can use a statistical power calculation is to estimate the number of injuries that would be necessary to claim a statistically significant reduction in a sample of only 85 workers. For this we did multiple calculations, setting the 85 individuals (n=85) in the context of different numbers of employees:
|Injury rate Year 1||Injury rate Year 4||Number of injuries Year 1||Number of injuries Year 4||Number of individuals needed|
It shows that the manager who employs 85 people would need to record a drop from 20,400 injuries in Year 1 to 1,020 injuries in Year 4. This is an injury rate drop from 11.8% to 0.62% (which is a 19-fold reduction). Only with numbers this large would the manager be able to claim that the reduction is not due to chance. Do note, though, that he or she can be believed only 19 out of 20 times when the claim is made. In other words, in 1 out of 20 times it is still possible that the reduction is actually due to chance after all.
This also applies to comparing injury rates between companies or sites
We can use the same calculations to show that comparing injury rates between companies is largely a fool’s errand. If company A has 19 injuries in one year, and company B has 1 injury in the same year (and their workforces are comparably large and they work roughly the same hours in a year), then this is just a random variation, a pure chance difference, unless (with such a low injury rate) the companies employ some 65,000 people each.
With injury numbers relative to hours worked (i.e. injury rate or any other rate) as low as they are, it becomes easy to show that the requirements of statistical significance are never met. In other words, managers or boards claiming that they have seen a significant reduction in injury rate, or a significant difference between their injury rate and someone else’s injury rate, actually have no statistical basis for their claims. It’s literally mostly make-believe.