The Wells Report, the NFL’s official report on the DeflateGate scandal, was written to try to end the scientific discussion on the football pressures as measured at halftime of the AFC Championship game between the Colts and the Patriots.
It still causes a lot of confusion.
If one were to give a peer review** of the scientific side of the report (that work was done by the consulting company Exponent), and to suggest areas where further investigation might clear up some concerns or lead to an entirely different conclusion, these are the problem areas:
- Artificial Turf Field Temperature: Surprisingly, this was not considered in the Wells Report at all. Under the artificial grass would have been cold sand and cold ground, frozen by the cold January weather leading up to the extraordinarily warm (for January) 50°F temperature at game time. Turf fields are designed to bring cooling from underneath to the surface. This would have cooled balls sitting on the grass, in particular the Patriots balls, since the Patriots had a long sustained drive within 20 minutes of the halftime measurements. This may explain the wide distribution in pressures of the Patriots balls, and their lower average pressure. Consideration of this may totally exonerate the Patriots. The investigators must look into this.
- Uncertainty: A great deal of the report deals with the uncertainty in the pressure measurements themselves, which gauges were used, the temperature of the locker room, and the timing of those measurements. In the midst of all this uncertainty, the conclusion reached was based on the Patriots average pressure, with its uncertainty, lying just outside the range of the investigators’ model, and on a timing of measurement that was just outside their model as well. Because these measurements sit so close to the edge of the uncertainty range, the conclusion requires accurate accounting not only of the averages but of the uncertainty as well. This is true of the measurements by the gauges, and of how well the model and apparatus the investigators put together actually matched the game time conditions. Regarding the model, there is not enough discussion of this, nor are any experimental error bars shown on their charts. The control on this was the measurement of the Colts balls. Unfortunately, only 4 of the Colts balls were measured, and we don’t know if those balls were used in the game and were wet, or if they were not used in the game and were dry. We don’t know if the ball boys held them in their hands and warmed them, or what happened to them. It is difficult to consider these a proper control. The investigators should present their results as an investigation of how well their game time simulation, with its proper experimental uncertainty, and the Ideal Gas Law match the results of both sets of balls, rather than adjusting the model by making the Colts balls a control group.
- Evaporative Cooling and Uncertainty: A systemic difference between the pressure of a wet ball and the pressure of a dry ball can be seen in the investigators’ results. Although it is not explained in the report, this might be expected to reflect the “dry-bulb” and “wet-bulb” temperatures found in meteorology. The root of this is really evaporative cooling. This is a “systemic” result, and not an experimental uncertainty as is suggested in the charts shown by the investigators. The charts should have reflected an uncertainty in starting temperature, and any other uncertainty in their model which might be observed when repeating the experiment or trying out the various other conditions they describe. Since the uncertainty is critical in coming to the final conclusion of this report, it should be dealt with correctly. Because the Patriots average pressure was just outside the range shown, even this change in uncertainty modeling might lead to a different conclusion.
There appear to be three competing theories to describe the football pressures as measured at halftime of the AFC Championship game between the Colts and the Patriots:
- Someone let air out of some of the Patriots footballs.
- There is so much uncertainty surrounding the measurements that no definite conclusion can be reached.
- There are physical processes not accounted for in the Wells Report which lowered the temperatures of some of the balls to the point at which the measured pressures were achievable.
The Wells Report concluded that theory #1 is the most “probable” explanation of what occurred. They came to that conclusion by accounting for all known sources of uncertainty, and by doing a comprehensive review of any and all physical processes which might have accounted for changes in the ball pressures. They created an apparatus to model the “game day scenario” to demonstrate what might have happened at Foxboro that day especially taking into account transient effects due to changes in temperatures when the ball pressures were measured. Based on the experimental results of running that model they reached their conclusion. The data of the Patriots football measurements do not fit their model and the Colts football measurements do, so therefore the Patriots explanation is “not credible”. Someone deflated the balls.
The Patriots contend that theory #2 is the best explanation. They point to the uncertain conditions under which the pressure measurements were taken, the inaccuracy of the equipment, the uncertainty in the timing, and the continued insistence by everyone interviewed who might have let air out of the balls, or colluded with others to do so, that no such act happened. Their rebuttal of the Wells Report, including many of the non-scientific aspects of the report, is published online here. (Wells Report Context: From the Patriots)
Theory #3 is something that is mentioned a number of times in the Wells Report in the form of the statement:
“Therefore, subject to the discovery of an as yet unidentified and unexamined factor, the measurements recorded for the Patriots footballs on Game Day do not appear to be completely explainable based on natural causes alone.” (pg. 61, Exponent Report)
This is the suggestion that some new unknown factor might cause their theory to be wrong. That unexamined factor may be the cold sand and ground under the turf field. We start with a discussion of that factor.
Artificial Turf Temperature
The Wells report claims that the air temperature alone is insufficient to have caused the low pressures measured in the balls. But what about the playing surface? This concern about the temperature in and under the artificial turf falls in line with theory #3 — other physical processes not considered.
The cooling due to the ball sitting at the line of scrimmage in between each play, and the fact that the Patriots had a sustained drive just before half-time can explain the wide variety of pressures in the Patriots balls, and the difference between the Colts balls and the Patriots as measured at halftime.
At that time in January there would have been a block of frozen ground underneath Gillette Stadium’s field, and turf fields are designed to allow cooling from below. In fact the turf is designed with a layer of “high heat capacity” sand which stays cooler on high temperature days, and the grass fibers bring that coolness up through a layer of insulating rubber pellets. While in play, the balls would sit on the field for extended periods of time, especially the Patriots balls during the sustained drive leading up to halftime. These balls would have experienced some of this cooling, and would still have been feeling those effects during the halftime measurements. Was this just overlooked by the investigators, or did they know about it and have a good reason for ignoring it?
The report should consider the playing field, and investigate the effects if any.
A separate report found here goes into more detail how this might account for all the low pressures of the balls at halftime, and even the wide variance in distribution of pressure in the Patriots balls that was found.
The Gauge: Which one was Used?
Walt Anderson, the referee who measured the pressure in the footballs before the game, had two pressure gauges. Subsequent tests by Exponent showed that the gauges read about 0.3 psi to 0.45 psi apart. The controversy is over which gauge was used in that first pressure measurement. The choice is crucial. As is pointed out by the Patriots, if we use only the measurements from the gauge Walt Anderson recalled using during the initial measurement, then the average pressure of the 11 footballs measured at halftime is consistent with the pressure expected by the Ideal Gas Law.
The Wells Report shows this in Fig. 27. The figure illustrates a model which the Wells Report claims shows the time dependence of the pressure expected from the Ideal Gas Law for a football sitting out in the open air of the locker room, warming up while being measured. At the left side of the chart, within the experimental uncertainty of the football measurements, the Patriots balls convincingly overlap with the value expected by the model!
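As a rough illustration of what “the pressure expected by the Ideal Gas Law” means here, the sketch below treats the ball as a fixed volume of air, so that absolute pressure scales with absolute temperature. The 12.5 psi starting pressure is the pre-game figure quoted later in this review. The 70°F locker room, the 48°F ball temperature, and the 14.7 psi atmospheric pressure are assumptions chosen for illustration, not values taken from the report, and the function names are hypothetical.

```python
# Illustrative sketch only: the temperatures and atmospheric pressure used
# here are assumptions, not values taken from the Wells Report.

ATM_PSI = 14.7  # assumed atmospheric pressure, psi


def fahrenheit_to_rankine(temp_f: float) -> float:
    """Convert degrees Fahrenheit to the absolute Rankine scale."""
    return temp_f + 459.67


def expected_gauge_pressure(p_gauge_start: float,
                            temp_start_f: float,
                            temp_end_f: float,
                            atm_psi: float = ATM_PSI) -> float:
    """Gauge pressure after a temperature change at constant volume.

    The Ideal Gas Law at fixed volume gives P_abs / T_abs = constant,
    so the gauge reading is converted to absolute pressure first.
    """
    p_abs_start = p_gauge_start + atm_psi
    ratio = fahrenheit_to_rankine(temp_end_f) / fahrenheit_to_rankine(temp_start_f)
    return p_abs_start * ratio - atm_psi


if __name__ == "__main__":
    p = expected_gauge_pressure(12.5, temp_start_f=70.0, temp_end_f=48.0)
    print(f"Expected gauge pressure at 48 F: {p:.2f} psi")
```

With these assumed numbers the drop is a little over 1 psi. Changing the assumed starting temperature by two degrees moves the answer by about a tenth of a psi, which is why the temperature uncertainty matters so much.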
The Wells Report contends that time dependence is the problem. The balls took 3-8 minutes to be measured, and during that time they should have been warming up. On the chart (look at the brown curves), the flat brown line, with its uncertainty band shown in light brown, starts to lie outside the wet-ball curve (the dashed brown line) after 4 minutes, and by 8 minutes it sits outside the curve by almost 1.5 times the size of the uncertainty bars.
If we use the other gauge, then the average pressure was significantly lower, and must be explained some other way. Since the investigators fail to convincingly remove the uncertainty surrounding the gauges, their conclusions should be based only on ruling out the measurements as made by both gauges.
Key to the conclusion of the report is understanding the uncertainty surrounding the pressure measurements of the balls, and in particular determining if that uncertainty can be reduced to a level that any conclusion at all can be reached. The investigators reached the conclusion that it was “probable” that someone had removed air from the balls.
Unfortunately, a game day scenario for football manufactured in a lab would seem to have a lot of uncertainty about it, no matter how carefully scientists try to reproduce or account for the weather and other game conditions. To show the only uncertainty in the experimental game day model as being due to dry/wet ball measurements, and then to base conclusions as far reaching as these on it, is questionable.
Why aren’t there shaded uncertainty areas around the pressure lines depicted in the graphs in Figs. 26 and 27, on which so much of the conclusion of this report rests? At a minimum we know there is an uncertainty in the initial measurement temperature that should be represented directly on these charts. If that uncertainty is 0.1 psi to 0.2 psi, it can change how people view this chart considerably. And certainly that uncertainty in the temperature measurement should not be used or manipulated as suggested in the statement “the various temperatures were adjusted such that the measurements obtained via these simulations correspond to the Colts measurements” (pg. 57 of Exponent report). Please, just put the correct uncertainty estimates in, and let the chips fall where they may with the data. If the Colts data does not align perfectly with the model’s prediction, it is important to see that.
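To get a feel for the size of this effect, the following sketch propagates an uncertainty in starting temperature into an uncertainty in pressure, using the first-order relation dP = P_abs * dT / T_abs that follows from the Ideal Gas Law at constant volume. The 70°F starting temperature and the 14.7 psi atmospheric pressure are assumptions for illustration, and the function name is hypothetical.

```python
# Illustrative sketch only: starting temperature and atmospheric pressure
# are assumed values, not figures taken from the report.

def pressure_sigma_from_temperature(p_gauge: float,
                                    temp_f: float,
                                    temp_sigma_f: float,
                                    atm_psi: float = 14.7) -> float:
    """First-order pressure uncertainty (psi) from a temperature uncertainty (deg F).

    At constant volume P_abs is proportional to T_abs, so
    dP = P_abs * dT / T_abs, with T_abs on the Rankine scale.
    """
    p_abs = p_gauge + atm_psi
    t_abs = temp_f + 459.67
    return p_abs * temp_sigma_f / t_abs


if __name__ == "__main__":
    for sigma_f in (1.0, 2.0, 4.0):
        sigma_p = pressure_sigma_from_temperature(12.5, 70.0, sigma_f)
        print(f"+/- {sigma_f:.0f} F in starting temperature -> +/- {sigma_p:.2f} psi")
```

Under these assumptions, an uncertainty of 0.1 psi to 0.2 psi corresponds to only about 2°F to 4°F in the starting temperature.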
It is also important to note that the difference between the wet-ball curve and the dry-ball curve could be a systemic difference (see Evaporative Cooling below), and thus does not and should not represent experimental error limits for these curves! The investigators should be able to control for these systemic effects, or else state that some uncertainty exists because we do not know the humidity and/or air circulation that existed within the officials’ locker room, and represent some bound on that type of uncertainty. On some of the diagrams in the report, the wet-dry difference is about 0.1 to 0.2 psi. On other charts, like the one above, it appears to be about 0.5 psi.
Directly related to those pressure lines is the Fig 21 graph (not shown), where the rising pressure for the dry balls does not seem to follow an exponential curve. This suggests some other systemic problem, or at least should be explained by the investigators as to why it may be that way.
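For comparison, the simplest textbook picture of a ball warming in room air is Newton's law of cooling, in which the temperature gap to the room decays exponentially and the pressure follows through the Ideal Gas Law. The sketch below is only that textbook picture, not Exponent's transient model. The time constant, the temperatures, and the function names are all hypothetical.

```python
import math

ATM_PSI = 14.7           # assumed atmospheric pressure, psi
RANKINE_OFFSET = 459.67  # deg F to deg R


def ball_temp_f(t_min: float, start_f: float, room_f: float, tau_min: float) -> float:
    """Newton's law of cooling: the gap to room temperature decays as exp(-t/tau)."""
    return room_f - (room_f - start_f) * math.exp(-t_min / tau_min)


def gauge_pressure(t_min: float, p_ref_gauge: float, ref_f: float,
                   start_f: float, room_f: float, tau_min: float) -> float:
    """Gauge pressure at time t for a ball that read p_ref_gauge at ref_f.

    Constant volume is assumed, so absolute pressure tracks absolute temperature.
    """
    temp_abs = ball_temp_f(t_min, start_f, room_f, tau_min) + RANKINE_OFFSET
    ref_abs = ref_f + RANKINE_OFFSET
    return (p_ref_gauge + ATM_PSI) * temp_abs / ref_abs - ATM_PSI


if __name__ == "__main__":
    # Hypothetical inputs: ball set to 12.5 psi at 70 F, brought in at 48 F,
    # warming toward a 70 F room with an assumed 20 minute time constant.
    for t in range(0, 13, 2):
        p = gauge_pressure(t, 12.5, ref_f=70.0, start_f=48.0, room_f=70.0, tau_min=20.0)
        print(f"t = {t:2d} min  pressure = {p:.2f} psi")
```

If measured dry-ball data depart clearly from a curve of this shape, that would point to something the simple picture leaves out.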
Although the investigators do a great job reducing and explaining much of the uncertainty surrounding pressure gauges, squashing footballs, stretching football leather, etc, they need to provide more accounting for uncertainty in their own model on which they base their conclusions.
Finally, it is not unusual for results to lie outside their uncertainty estimate. With a “normal” distribution this will occur approximately one third of the time. It is just more “probable” that on repeating the experiment the result will lie within the estimated uncertainty. Given how close the measurements are to this range in Fig. 27 above, the only conclusion that can be reached is that it is “probable”, but not certain, that the pressures in the Patriots balls cannot be explained within the model. Further experimentation, including the cold turf and some representation of the uncertainty due to our not knowing the humidity and evaporative cooling effects, would be needed to say more.
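The “one third” figure can be checked directly. Assuming the quoted uncertainty is one standard deviation of a normal distribution (an assumption; the report may define its bands differently), the fraction of results expected to fall more than k standard deviations from the mean is 1 - erf(k / sqrt(2)). The function name below is hypothetical.

```python
import math


def fraction_outside(k_sigma: float) -> float:
    """Two-sided probability that a normal variable lies beyond +/- k_sigma."""
    return 1.0 - math.erf(k_sigma / math.sqrt(2.0))


if __name__ == "__main__":
    for k in (1.0, 1.5, 2.0):
        print(f"outside +/- {k} sigma: {fraction_outside(k):.1%}")
```

This gives about 32% at one standard deviation, about 13% at 1.5, and under 5% at two, so a result somewhat outside a one-sigma band is far from rare.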
Using the Colts Balls as Controls
A “control group” in an experiment is a set of items (in this case the Colts footballs) which are statistically similar to the other group (the Patriots footballs) but not subjected to the changes which are being tested for. Unfortunately, only a subset of the Colts balls was measured. These 4 balls may have all been dry. All of the balls would need to be included to be sure the sample contained balls that had been on the field getting cold and wet. Without that, we may be comparing apples with oranges. We know there may be at least some systemic differences between balls which were wet and dry (see Evaporative Cooling below), not to mention issues due to their use or non-use in the game, as raised in the Artificial Turf discussion above. Why not just compare everything to the measurements predicted by the Ideal Gas Law and the transient model?
The Distribution in the Pressure of the Patriots Footballs
The distribution of the pressure measurements in the Patriots balls is an inconvenient truth for both sides. They are spread too broadly. This is a glaringly obvious problem if we assume it is just the Ideal Gas Law that took the balls from their narrow distribution before the game (12.5 psi +/- 0.1 psi) to the halftime measurements, which are spread over 1.3 psi. The Exponent investigators have shown that in the lab the temperature/pressure changes are very repeatable, and follow the Ideal Gas Law. We should not see such a wide distribution.
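A quick estimate shows how large a temperature spread the Ideal Gas Law alone would need in order to produce this pressure spread. At constant volume the absolute pressure is proportional to the absolute temperature, so a pressure spread maps to a temperature spread as worked below. The starting temperature of 70°F (529.67 °R) and the atmospheric pressure of 14.7 psi are assumptions for illustration.

```latex
\frac{P_{\mathrm{abs}}}{T} = \frac{P_0}{T_0}
\quad\Longrightarrow\quad
\Delta T = \frac{T_0}{P_0}\,\Delta P
         = \frac{529.67\ {}^{\circ}\mathrm{R}}{(12.5 + 14.7)\ \mathrm{psi}} \times 1.3\ \mathrm{psi}
         \approx 25\ {}^{\circ}\mathrm{F}
```

Under these assumptions, the balls would have to differ from one another by roughly 25°F to account for the spread through temperature alone.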
Even theory #1 has trouble here. A person letting air out would have had to be greatly inconsistent in how they did it, ignoring some balls and letting much more out of others. Would a random set of pressures be good for a quarterback? Does that make sense?
And theory #2: while averaging over the halftime pressures in these balls and obtaining an estimated uncertainty seems reasonable, the nagging question is why the pressures showed such a large variance. It does not seem physically reasonable.
Theory #3, however, makes sense. Very simply, the cold artificial turf theory would explain the spread as due to some balls having been on the field more recently than others.
Systemic Effects: Evaporative Cooling
At least one blog post showed concern about evaporative cooling causing some significant effects on the ball pressure. In one of the figures (Fig. 27 shown above), the experimental results show two curves, one for a dry ball and one for a wet ball. They differ by about 0.5 psi. Although the reason for this difference is unexplained in the report, some atmospheric scientists might guess that this is the kind of difference that is accounted for by “dry-bulb” and “wet-bulb” types of measurements. These differences arise because evaporative cooling lowers the temperature of the wet bulb (or wet ball), thereby lowering the pressure. The effect can be larger or smaller depending upon the humidity in the air and the air circulation.
An explanation of these clearly measurable effects, whether they are due to evaporative cooling or not, is warranted, particularly since this difference is represented in the graphs in Fig. 27 and Fig. 28, on which the conclusions of the report were founded.
The comments in this paper are meant as a “peer review” of the scientific evidence and experimental procedure presented in the Wells Report, in particular where that may influence the conclusion of the report.
These are some important questions that were raised in a careful reading of the scientific evidence of the report.
The expectation and hope is that the Investigators would look seriously at the effects mentioned here, and answer the questions raised.
Suggestion for the Future
Unfortunately, this experience with the Patriots-Colts game is the first in which the pressure inside the football has been called into question. Undoubtedly, in the future there are some things that could be done better. There could be better recording of pressure. There could be more careful monitoring of the game balls.
In the future, to verify that the balls have not been tampered with, the pressures could be measured ahead of the game as they are now, and then measured again about an hour after the game to see if they come back to their original pressure.
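A check like this is simple to automate. The sketch below flags any ball whose post-game reading has not returned to within a tolerance of its pre-game reading. The 0.2 psi tolerance, the function name, and the sample readings are all hypothetical, and a real tolerance would need to reflect gauge repeatability.

```python
from typing import List, Sequence


def flag_suspect_balls(pre_game_psi: Sequence[float],
                       post_game_psi: Sequence[float],
                       tolerance_psi: float = 0.2) -> List[int]:
    """Return the indices of balls whose pressure did not come back to
    within tolerance_psi of the pre-game reading."""
    if len(pre_game_psi) != len(post_game_psi):
        raise ValueError("each ball needs one pre-game and one post-game reading")
    return [
        i
        for i, (before, after) in enumerate(zip(pre_game_psi, post_game_psi))
        if abs(before - after) > tolerance_psi
    ]


if __name__ == "__main__":
    # Made-up readings for three balls, for illustration only.
    pre = [12.5, 12.5, 13.0]
    post = [12.45, 11.9, 12.95]
    print("Balls needing a closer look:", flag_suspect_balls(pre, post))
```

Both sets of readings should be taken with the same gauge and at the same room temperature, so that the comparison is not clouded by the kinds of uncertainty discussed above.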
**The scientific norm of peer review before publishing a scientific article is a time-honored process which leads to fewer mistakes, a focus on critical aspects of the investigation, and wider acceptance of the results. In this process, prior to publication, the article is sent to a handful of independent reviewers who may or may not be known to the lead investigator. These reviewers, who have general knowledge of the subject, study the report, and their comments are sent back to the investigator, giving that person a chance to improve the publication or even go back to the drawing board.
In the Wells Report, it appears there was not an outside review process. Dr. Daniel Marlow was the lead investigator, and is clearly an esteemed physicist, but as any scientist knows, the scientific method can be fraught with making bad assumptions, failing to discover or notice key facts, and a lot of trial and error. Discovery quite often comes through recognizing these mistakes, pressing forward, and finding the correct answers.
This, in and of itself, does not mean that mistakes were made in the investigation. In fact, for the topics that were considered, most scientists agree that the report by Exponent was capably done. However, by forgoing peer review, the group gave up both the chance that others might identify problems with the report ahead of time and the wider acceptance of the results that such review brings.
***Improving the Game Day Scenario Experiment
A way to further the realism of the “Game Day Scenario” is to include the football game itself. Create a simulated field by laying some damp towels on a layer of ice and letting it sit outside at 48°F for an hour. Then repeat the above experiment, but instead of dampening the balls with 48°F water, periodically bring them out, push and roll them on the damp towels, and let them sit on the towels for 2-3 minutes. Then put them back in the bag. In the last 25 minutes, up to 3 minutes before bringing them inside for measurement, repeat the push and roll on the towels, always keeping at least one ball out sitting on the towels.