Sunday, August 27, 2017

New p-Value Thresholds for Statistical Significance

This is presently among the hottest topics / discussions / developments in statistics.  Seriously.  Just look at the abstract and dozens of distinguished authors of the paper below, which is forthcoming in one of the world's leading science outlets, Nature Human Behaviour.

Of course data mining, or overfitting, or whatever you want to call it, has always been a problem, which has always warranted strong and healthy skepticism regarding alleged "new discoveries".  But the whole point of examining p-values is to AVOID anchoring on arbitrary significance thresholds, whether the old magic .05 or the newly-proposed magic .005.  Just report the p-value, and let people decide for themselves how they feel.  Why obsess over asterisks, and whether/when to put them next to things?


Reading the paper, which I had not done before writing the paragraph above (there's largely no need, as the wonderfully concise abstract says it all), I see that it anticipates my objection at the end of a section entitled "potential objections":
Changing the significance threshold is a distraction from the real solution, which is to replace null hypothesis significance testing (and bright-line thresholds) with more focus on effect sizes and confidence intervals, treating the P-value as a continuous measure, and/or a Bayesian method.
Hear, hear! Marvelously well put.

The paper offers only a feeble refutation of that "potential" objection:
Many of us agree that there are better approaches to statistical analyses than null hypothesis significance testing, but as yet there is no consensus regarding the appropriate choice of replacement. ... Even after the significance threshold is changed, many of us will continue to advocate for alternatives to null hypothesis significance testing. 
I'm all for advocating alternatives to significance testing.  That's important and helpful.  As for continuing to promulgate significance testing with magic significance thresholds, whether .05 or .005, well, you can decide for yourself.
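To make the "just report the p-value" point concrete, here is a minimal sketch. The estimate and standard error are hypothetical numbers chosen purely for illustration (they are not from the paper), and the p-value uses a simple two-sided normal approximation.

```python
import math

# Hypothetical estimate and standard error (illustrative assumptions only).
est, se = 0.42, 0.17

z = est / se
# Two-sided p-value under a normal approximation: P(|Z| > |z|) = erfc(|z| / sqrt(2)).
p = math.erfc(abs(z) / math.sqrt(2))
# Approximate 95% confidence interval.
ci = (est - 1.96 * se, est + 1.96 * se)

# Report the continuous quantities and let the reader judge.
print(f"estimate = {est:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), p = {p:.4f}")

# The same result flips its label depending on which magic threshold is used.
for alpha in (0.05, 0.005):
    label = "significant" if p < alpha else "not significant"
    print(f"threshold {alpha}: {label}")
```

With these made-up numbers the p-value falls between .005 and .05, so the asterisk appears or disappears with the choice of threshold, while the estimate, interval, and p-value themselves are unchanged.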

Redefine Statistical Significance
By: Daniel Benjamin ; James Berger ; Magnus Johannesson ; Brian Nosek ; E. Wagenmakers ; Richard Berk ; Kenneth Bollen ; Bjorn Brembs ; Lawrence Brown ; Colin Camerer ; David Cesarini ; Christopher Chambers ; Merlise Clyde ; Thomas Cook ; Paul De Boeck ; Zoltan Dienes ; Anna Dreber ; Kenny Easwaran ; Charles Efferson ; Ernst Fehr ; Fiona Fidler ; Andy Field ; Malcolm Forster ; Edward George ; Tarun Ramadorai ; Richard Gonzalez ; Steven Goodman ; Edwin Green ; Donald Green ; Anthony Greenwald ; Jarrod Hadfield ; Larry Hedges ; Leonhard Held ; Teck Hau Ho ; Herbert Hoijtink ; James Jones ; Daniel Hruschka ; Kosuke Imai ; Guido Imbens ; John Ioannidis ; Minjeong Jeon ; Michael Kirchler ; David Laibson ; John List ; Roderick Little ; Arthur Lupia ; Edouard Machery ; Scott Maxwell ; Michael McCarthy ; Don Moore ; Stephen Morgan ; Marcus Munafo ; Shinichi Nakagawa ; Brendan Nyhan ; Timothy Parker ; Luis Pericchi ; Marco Perugini ; Jeff Rouder ; Judith Rousseau ; Victoria Savalei ; Felix Schonbrodt ; Thomas Sellke ; Betsy Sinclair ; Dustin Tingley ; Trisha Zandt ; Simine Vazire ; Duncan Watts ; Christopher Winship ; Robert Wolpert ; Yu Xie ; Cristobal Young ; Jonathan Zinman ; Valen Johnson

We propose to change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005.

Friday, August 25, 2017

Flipping the https Switch

I just flipped a switch to convert No Hesitations from http to https, which should be totally inconsequential to you -- you should not need to do anything, but obviously let me know if your browser chokes.  The switch will definitely solve one problem:  Chrome has announced that it will soon REQUIRE https.  Moreover, the switch may help with another problem.  There have been issues over the years with certain antivirus software blocking No Hesitations without a manual override.  The main culprit seems to be Kaspersky Antivirus.  Maybe that will now stop.

Sunday, August 20, 2017

Bayesian Random Projection (More on Terabytes of Economic Data)

Some additional thoughts related to Serena Ng's World Congress piece (earlier post here, with a link to her paper):

The key newish dimensionality-reduction strategies that Serena emphasizes are random projection and leverage score sampling.  In a regression context both are methods for optimally approximating an NxK "X matrix" with an Nxk X matrix, where k<<K. They are very different and there are many issues. Random projection delivers a smaller X matrix with columns that are linear combinations of those of the original X matrix, as for example with principal-component regression, which can sometimes make for difficult interpretation.  Leverage score sampling, in contrast, delivers a smaller X matrix with columns that are simply a subset of those of the original X matrix, which feels cleaner but has issues of its own.
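The contrast between the two approaches can be sketched in a few lines. This is only an illustration under assumptions of my own: simulated Gaussian data, arbitrary sizes N, K, k, a Gaussian projection matrix, and column leverage scores computed from the top-k right singular vectors. It is not taken from Serena's paper.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, k = 200, 50, 5                 # illustrative sizes, with k << K
X = rng.standard_normal((N, K))      # simulated stand-in for the "X matrix"

# Random projection: every new column is a random linear combination
# of all K original columns.
R = rng.standard_normal((K, k)) / np.sqrt(k)
X_rp = X @ R                         # shape (N, k)

# Leverage score sampling over columns: score each column by its weight
# in the top-k right singular vectors, then sample k columns without
# replacement with probability proportional to the scores.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
scores = np.sum(Vt[:k, :] ** 2, axis=0)   # one score per column; scores sum to k
probs = scores / scores.sum()
cols = np.sort(rng.choice(K, size=k, replace=False, p=probs))
X_ls = X[:, cols]                    # shape (N, k): a subset of the original columns

print(X_rp.shape, X_ls.shape, cols)
```

Both results are Nxk, but the columns of X_ls are original regressors that can still be named, whereas each column of X_rp blends all K of them.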

Anyway, a crucial observation is that for successful predictive modeling we don't need deep interpretation, so random projection is potentially just fine -- if it works, it works, and that's an empirical matter.  Econometric extensions (e.g., to VAR's) and evidence (e.g., in macro forecasting) are just now emerging, and the results appear encouraging.  An important recent contribution in that regard is Koop, Korobilis, and Pettenuzzo (in press), which significantly extends and applies earlier work of Guhaniyogi and Dunson (2015) on Bayesian random projection ("compression").  Bayesian compression fits beautifully in an MCMC framework (again see Koop et al.), including model averaging across multiple random projections, attaching greater weight to projections that forecast well.  Very exciting!
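The idea of averaging across many random projections, with more weight on those that forecast well, can be illustrated with a deliberately crude stand-in. To be clear about the assumptions: this is NOT the Bayesian/MCMC procedure of Koop et al. or Guhaniyogi and Dunson. It uses simulated data, plain least squares on the compressed regressors, and inverse hold-out MSE weights in place of proper Bayesian model-averaging weights.

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, k, M = 120, 40, 4, 25          # M random projections; all sizes illustrative
X = rng.standard_normal((N, K))
beta = np.zeros(K)
beta[:3] = 1.0                       # simulated sparse truth (an assumption)
y = X @ beta + rng.standard_normal(N)

train = slice(0, 80)                 # fit each compressed regression here
hold = slice(80, 100)                # score forecast performance here
test = slice(100, N)                 # produce combined forecasts here

forecasts, mses = [], []
for _ in range(M):
    R = rng.standard_normal((K, k)) / np.sqrt(k)   # one random projection
    Z = X @ R                                      # compressed N x k regressors
    coef, *_ = np.linalg.lstsq(Z[train], y[train], rcond=None)
    mses.append(np.mean((y[hold] - Z[hold] @ coef) ** 2))
    forecasts.append(Z[test] @ coef)

# Crude weights: projections with lower hold-out MSE get more weight.
weights = 1.0 / np.array(mses)
weights /= weights.sum()
combined = weights @ np.array(forecasts)           # weighted-average forecast

print(weights.round(3))
print(np.mean((y[test] - combined) ** 2))
```

The design choice worth noting is that no single projection has to be "right": each one is cheap, and the weighting step lets the data decide which compressions are useful for forecasting.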

Monday, August 14, 2017

Analyzing Terabytes of Economic Data

Serena Ng's World Congress piece is out as an NBER w.p.  It's been floating around for a long time, but just in case you missed it, it's a fun and insightful read:

Opportunities and Challenges: Lessons from Analyzing Terabytes of Scanner Data
by Serena Ng  -  NBER Working Paper #23673.

(Ungated copy at


This paper seeks to better understand what makes big data analysis different, what we can and cannot do with existing econometric tools, and what issues need to be dealt with in order to work with the data efficiently.  As a case study, I set out to extract any business cycle information that might exist in four terabytes of weekly scanner data.  The main challenge is to handle the volume, variety, and characteristics of the data within the constraints of our computing environment. Scalable and efficient algorithms are available to ease the computation burden, but they often have unknown statistical properties and are not designed for the purpose of efficient estimation or optimal inference.  As well, economic data have unique characteristics that generic algorithms may not accommodate.  There is a need for computationally efficient econometric methods as big data is likely here to stay.

Saturday, August 12, 2017

On Theory, Measurement, and Lewbel's Assertion

Arthur Lewbel, insightful as always, asserts in a recent post that:
The people who argue that machine learning, natural experiments, and randomized controlled trials are replacing structural economic modeling and theory are wronger than wrong.
As ML and experiments uncover ever more previously unknown correlations and connections, the desire to understand these newfound relationships will rise, thereby increasing, not decreasing, the demand for structural economic theory and models.
I agree.  New measurement produces new theory, and new theory produces new measurement -- it's hard to imagine stronger complements.  And as I said in an earlier post,
Measurement and theory are rarely advanced at the same time, by the same team, in the same work. And they don't need to be. Instead we exploit the division of labor, as we should. Measurement can advance significantly with little theory, and theory can advance significantly with little measurement. Still each disciplines the other in the long run, and science advances.
The theory/measurement pendulum tends to swing widely.  If the 1970's and 1980's were a golden age of economic theory, recent decades have witnessed explosive advances in economic measurement linked to the explosion of Big Data.  But Big Data presents both measurement opportunities and pitfalls -- dense fogs of "digital exhaust" -- which fresh theory will help us penetrate.  Theory will be back.

[Related earlier posts:  "Big Data the Big Hassle" and "Theory gets too Much Respect, and Measurement Doesn't get Enough"]

Saturday, August 5, 2017

Commodity Connectedness

Forthcoming paper here
We study connectedness among the major commodity markets, summarizing and visualizing the results using tools from network science.

Among other things, the results reveal clear clustering of commodities into groups closely related to the traditional industry taxonomy, but with some notable differences.

Many thanks to the Central Bank of Chile for encouraging and supporting the effort via its 2017 Annual Research Conference.