Tuesday, July 22, 2014

The third time's a charm... 5 sigma

My last several posts concerned the observation of extra-terrestrial neutrinos.  The IceCube analysis, published in November, used 2 years of data to find evidence 'at the 4-sigma (standard deviation) level.'  Here, 4 sigma refers to the statistical significance of the result.  If the probabilities for our observation are described by a Gaussian ('normal') distribution, then 4 sigma corresponds to a roughly 1 in 16,000 probability (counting both tails of the distribution) that the result is just a statistical fluctuation.  This sounds like a pretty high standard, but, at least in physics, it is not normally considered enough to claim a discovery.  There are two reasons for this.
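
For readers who want to check the arithmetic, here is a minimal sketch of the conversion from sigma to probability.  It assumes a pure Gaussian distribution and counts both tails; the function name is just for illustration and is not taken from any IceCube software.

```python
import math

def two_sided_p_value(n_sigma):
    """Probability that a Gaussian variable lands more than n_sigma
    standard deviations from its mean, in either direction."""
    return math.erfc(n_sigma / math.sqrt(2))

for n_sigma in (3, 4, 5):
    p = two_sided_p_value(n_sigma)
    print(f"{n_sigma} sigma: p = {p:.2e}, about 1 in {1.0 / p:,.0f}")
```

With a one-sided convention (counting only upward fluctuations), each probability would be half as large.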

The first is known as 'trials factors.'  If you have a complex data set and a powerful computer, it is pretty easy to do far more than 16,000 experiments by, for example, looking in different regions of the sky, looking at different types of neutrinos, different energies, etc.  Of course, within a single analysis, we keep track of the trials factors - in a search for point sources of neutrinos, we know how many different (independent) locations we are looking in, how many energy bins, etc.  However, this becomes harder when there are multiple analyses looking for different things.  If we publish 9 null results and one positive one, should we dilute the 1 in 16,000 to 1 in 1,600?
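
The dilution can be sketched in a few lines.  This is only an illustration, under the assumption that the trials are fully independent and that each has the same chance of fluctuating; real analyses usually overlap, so the effective number of trials is harder to pin down.  The function name is made up for this example.

```python
import math

def global_p_value(local_p, n_trials):
    """Chance that at least one of n_trials independent searches
    fluctuates to local_p or beyond: 1 - (1 - local_p)**n_trials.
    Written with log1p/expm1 so that tiny probabilities stay accurate."""
    return -math.expm1(n_trials * math.log1p(-local_p))

local_p = 1.0 / 16000  # roughly 4 sigma, as quoted above
for n_trials in (1, 10, 1000, 16000):
    p = global_p_value(local_p, n_trials)
    print(f"{n_trials:>6} trials: p = {p:.3g}, about 1 in {1.0 / p:,.0f}")
```

For small probabilities the result is close to simply multiplying by the number of trials, which is where the 1 in 1,600 for ten analyses comes from.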

In the broader world of science, it is well known that it is easier (and better for one's career) to publish positive results rather than upper limits.  So, positive results are more likely to be published than negative ones.  Cold fusion (which made a big splash in the late 1980s) is a shining example of this.  When it was first reported, many, many physics departments set up cells to search for cold fusion.  Most of them didn't find anything, and quietly dropped the subject.  However, the few that saw evidence for cold fusion published their results.  The result was that a literature search could find almost as many positive results as negative ones.  However, if you asked around, the ratio was very different.

The second reason is that the probability distribution may not be described by a Gaussian (normal) distribution, for any of a number of reasons.  One concerns small numbers; if there are only a few events seen, then the distributions are better described by a Poisson distribution than a Gaussian.  Of course, we know how to handle this now.  Another is more nebulous - there may be systematic uncertainties that are not well described by a Gaussian distribution.  Or, there may be unknown correlations between these systematic uncertainties.  We try to account for these factors in assessing significance, but this is a rather difficult point.
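
To get a feel for the small-number problem, here is a rough sketch comparing the exact Poisson tail probability with a naive Gaussian estimate.  The numbers (3 expected background events, 10 observed) are hypothetical, chosen only for illustration; they are not IceCube's, and the function names are invented for this example.

```python
import math

def poisson_tail(n_observed, mean):
    """P(N >= n_observed) for a Poisson distribution with the given mean."""
    below = sum(math.exp(-mean) * mean**k / math.factorial(k)
                for k in range(n_observed))
    return 1.0 - below

def gaussian_tail(n_observed, mean):
    """One-sided tail probability if the same count is treated as a
    Gaussian with standard deviation sqrt(mean)."""
    z = (n_observed - mean) / math.sqrt(mean)
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical example: 3 expected background events, 10 observed.
expected, observed = 3.0, 10
print(f"Poisson tail:  {poisson_tail(observed, expected):.2e}")
print(f"Gaussian tail: {gaussian_tail(observed, expected):.2e}")
```

In this made-up case the Gaussian estimate makes the fluctuation look dozens of times less likely than the Poisson calculation says it is, which would overstate the significance.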

Anyway, that long statistical digression explains why our November paper was entitled "Evidence for..." rather than "Discovery of..."

Now, we have released a new result, available at http://arxiv.org/abs/arXiv:1405.5303.  The analysis has been expanded to include a third year of data, without any other changes.  The additional year of data included another 9 events, including Big Bird.  The data looked much like the first two years, and so pushed the statistical significance up to 5.7 sigma, which led to the title "Observation of..."  Of course, 5 sigma doesn't guarantee that a result is correct, but this one looks pretty solid.

With this result, we are now moving from trying to prove the existence of ultra-high energy astrophysical neutrinos to characterizing the flux and, from that, hopefully, finally pinning down the source(s) of the highest energy cosmic rays.
