Monday, November 20, 2017
More on Path Forecasts
I blogged on path forecasts yesterday. A reader just forwarded this interesting paper, of which I was unaware. Lots of ideas and up-to-date references.
Thursday, November 16, 2017
Forecasting Path Averages
Consider two standard types of \(h\)-step forecast:
(a). \(h\)-step forecast, \(y_{t+h,t}\), of \(y_{t+h}\)
(b). \(h\)-step path forecast, \(p_{t+h,t}\), of \(p_{t+h} = \{ y_{t+1}, y_{t+2}, ..., y_{t+h} \}\).
Clive Granger used to emphasize the distinction between (a) and (b).
As regards path forecasts, lately there's been some focus not on forecasting the entire path \(p_{t+h}\), but rather on forecasting the path average:
(c). \(h\)-step path average forecast, \(a_{t+h,t}\), of \(a_{t+h} = \frac{1}{h} [y_{t+1} + y_{t+2} + ... + y_{t+h}]\)
The leading case is forecasting "average growth", as in Mueller and Watson (2016).
Forecasting path averages (c) never resonated thoroughly with me. After all, (b) is sufficient for (c), but not conversely: the average is just one aspect of the path, and additional aspects (overall shape, etc.) might be of interest.
Then, listening to Ken West's FRB SL talk, my eyes opened. Of course the path average is insufficient for the whole path, but it's surely the most important aspect of the path; if you could know just one thing about the path, you'd almost surely ask for the average. Moreover, and this is important, it might be much easier to provide credible point, interval, and density forecasts of \(a_{t+h}\) than of \(p_{t+h}\).
So I still prefer full path forecasts when feasible/credible, but I'm now much more appreciative of path averages.
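To make the "(b) is sufficient for (c)" point concrete, here is a minimal sketch in Python. It assumes an AR(1) with known mean and persistence (the AR(1), the parameter values, and the function names are illustrative assumptions, not anything from the papers mentioned). For point forecasts that are conditional means, the path-average forecast is just the average of the path forecast, by linearity of expectation.

```python
def ar1_path_forecast(y_t, mu, phi, h):
    """Point path forecast (y_{t+1,t}, ..., y_{t+h,t}) from an assumed AR(1)
    with unconditional mean mu and autoregressive coefficient phi."""
    return [mu + phi ** j * (y_t - mu) for j in range(1, h + 1)]


def path_average_forecast(path):
    """Path-average point forecast a_{t+h,t}: the mean of the path forecast."""
    return sum(path) / len(path)


# Illustrative (made-up) values: current observation 8, mean 0, phi = 0.5, h = 3.
path = ar1_path_forecast(y_t=8.0, mu=0.0, phi=0.5, h=3)
avg = path_average_forecast(path)
print(path)  # [4.0, 2.0, 1.0]
print(avg)   # the average of the three path forecasts, 7/3
```

Going the other way is impossible: many different paths share the average 7/3. Interval and density forecasts of the average would additionally need the joint distribution of the path errors, which this sketch does not attempt.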
Wednesday, November 15, 2017
FRB St. Louis Forecasting Conference
Got back a couple days ago. Great lineup. Wonderful to see such sharp focus. Many thanks to FRBSL and the organizers (Domenico Giannone, George Kapetanios, and Mike McCracken). I'll hopefully blog on one or two of the papers shortly. Meanwhile, the program is here.
Wednesday, November 8, 2017
Artificial Intelligence, Machine Learning, and Productivity
As Bob Solow famously quipped, "You can see the computer age everywhere but in the productivity statistics". That was in 1987. The new "Artificial Intelligence and the Modern Productivity Paradox: A Clash of Expectations and Statistics," NBER w.p. 24001, by Brynjolfsson, Rock, and Syverson, brings us up to 2017. Still a puzzle. Fascinating. Ungated version here.
Sunday, November 5, 2017
Regression on Term Structures
An important insight regarding use of dynamic Nelson-Siegel (DNS) and related term-structure modeling strategies (see here and here) is that they facilitate regression on an entire term structure. Regressing something on a curve might initially sound strange, or ill-posed. The insight, of course, is that DNS distills curves into level, slope, and curvature factors; hence if you know the factors, you know the whole curve. And those factors can be estimated and included in regressions, effectively enabling regression on a curve.
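A minimal sketch of that idea in Python, under loudly stated assumptions: the decay parameter is fixed at 0.0609 (maturities in months, the value commonly used following Diebold and Li), the factors are extracted by simple date-by-date cross-section OLS rather than by a full state-space DNS estimation, and all data and function names are synthetic and hypothetical.

```python
import numpy as np

LAMBDA = 0.0609  # assumed fixed decay parameter, maturities measured in months


def ns_loadings(maturities, lam=LAMBDA):
    """Nelson-Siegel loadings on level, slope, and curvature at each maturity."""
    tau = np.asarray(maturities, dtype=float)
    x = lam * tau
    slope = (1.0 - np.exp(-x)) / x
    curvature = slope - np.exp(-x)
    return np.column_stack([np.ones_like(tau), slope, curvature])


def extract_factors(yields, maturities, lam=LAMBDA):
    """Date-by-date OLS of yields (T x N) on the loadings; returns T x 3 factors."""
    X = ns_loadings(maturities, lam)
    beta, *_ = np.linalg.lstsq(X, np.asarray(yields, dtype=float).T, rcond=None)
    return beta.T


def regress_on_curve(z, factors):
    """OLS of a scalar series z on a constant and the three curve factors."""
    W = np.column_stack([np.ones(len(factors)), factors])
    coef, *_ = np.linalg.lstsq(W, np.asarray(z, dtype=float), rcond=None)
    return coef


# Synthetic illustration: noise-free curves built from known factors.
rng = np.random.default_rng(0)
maturities = [3, 6, 12, 24, 36, 60, 84, 120]
true_factors = rng.standard_normal((200, 3))
yields = true_factors @ ns_loadings(maturities).T
z = 1.0 + true_factors @ np.array([0.5, -0.2, 0.1])

factors = extract_factors(yields, maturities)
coef = regress_on_curve(z, factors)
print(coef)  # intercept and coefficients on level, slope, curvature
```

The regression on three factors stands in for a regression on the whole curve, because the three factors (with the fixed loadings) reproduce every yield on it.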
In a stimulating new paper, “The Time-Varying Effects of Conventional and Unconventional Monetary Policy: Results from a New Identification Procedure”, Atsushi Inoue and Barbara Rossi put that insight to very good use. They use DNS yield curve factors to explore the effects of monetary policy during the Great Recession. That monetary policy is often dubbed "unconventional" insofar as it involved the entire yield curve, not just a very short "policy rate".
I recently saw Atsushi present it at NBER-NSF and Barbara present it at Penn's econometrics seminar. It was posted today, here.
Sunday, October 29, 2017
What's up With "Fintech"?
It's been a while, so it's time for a rant (in this case gentle, with no names named).
Discussion of financial technology ("fintech", as it's called) seems to be everywhere these days, from business school fintech course offerings to high-end academic fintech research conferences. I definitely get the business school thing: tech is cool with students now, and finance is cool with students now, and there are lots of high-paying jobs.
But I'm not sure I get the academic research thing. We can talk about "Xtech" for almost unlimited X: shopping, travel, learning, medicine, construction, sailing, ..., and yes, finance. It's all interesting, but is there something extra interesting about X=finance that elevates fintech to a higher level? Or elevates it to a serious and separate new research area? If there is, I don't know what it is, notwithstanding the cute name and all the recent publicity.
(Some earlier rants appear to the right, under Browse by Topic / Rants.)
Sunday, October 22, 2017
Pockets of Predictability
The possibility of localized "pockets of predictability", particularly in financial markets, is obviously intriguing. Recently I'm noticing a similarly intriguing pocket of research on pockets of predictability.
The following paper, for example, was presented at the 2017 NBER-NSF Time Series conference at Northwestern University, even if it is evidently not yet circulating:
"Pockets of Predictability", by Leland Farmer (UCSD), Lawrence Schmidt (Chicago), and Allan Timmermann (UCSD). Abstract: We show that return predictability in the U.S. stock market is a localized phenomenon, in which short periods, “pockets,” with significant predictability are interspersed with long periods with little or no evidence of return predictability. We explore possible explanations of this finding, including time-varying risk premia, and find that they are inconsistent with a general class of affine asset pricing models which allow for stochastic volatility and compound Poisson jumps. We find that pockets of return predictability can, however, be explained by a model of incomplete learning in which the underlying cash flow process is subject to change and investors update their priors about the current state. Simulations from the model demonstrate that investors’ learning about the underlying cash flow process can induce patterns that look, ex-post, like local return predictability, even in a model in which ex-ante expected returns are constant.
And this one just appeared as an NBER w.p.: "Sparse Signals in the Cross-Section of Returns", by Alexander M. Chinco, Adam D. Clark-Joseph, Mao Ye, NBER w.p. 23933, October 2017.
http://papers.nber.org/papers/w23933?utm_campaign=ntw&utm_medium=email&utm_source=ntw
Abstract: This paper applies the Least Absolute Shrinkage and Selection Operator (LASSO) to make rolling 1-minute-ahead return forecasts using the entire cross section of lagged returns as candidate predictors. The LASSO increases both out-of-sample fit and forecast-implied Sharpe ratios. And, this out-of-sample success comes from identifying predictors that are unexpected, short-lived, and sparse. Although the LASSO uses a statistical rule rather than economic intuition to identify predictors, the predictors it identifies are nevertheless associated with economically meaningful events: the LASSO tends to identify as predictors stocks with news about fundamentals.
Here's some associated work in dynamical systems theory: "A Mechanism for Pockets of Predictability in Complex Adaptive Systems", by Jorgen Vitting Andersen, Didier Sornette, Europhysics Letters, 2005. https://arxiv.org/abs/cond-mat/0410762
Abstract: We document a mechanism operating in complex adaptive systems leading to dynamical pockets of predictability ("prediction days''), in which agents collectively take predetermined courses of action, transiently decoupled from past history. We demonstrate and test it out-of-sample on synthetic minority and majority games as well as on real financial time series. The surprising large frequency of these prediction days implies a collective organization of agents and of their strategies which condense into transitional herding regimes.
There's even an ETH Zürich master's thesis: "In Search Of Pockets Of Predictability", by AT Morera, 2008
https://www.ethz.ch/content/dam/ethz/specialinterest/mtec/chairofentrepreneurialrisksdam/documents/dissertation/master%20thesis/Master_Thesis_Alan_Taxonera_Sept08.pdf
Finally, related ideas have appeared recently in the forecast evaluation literature, such as this paper and many of the references therein: "Testing for State-Dependent Predictive Ability", by Sebastian Fossati, University of Alberta, September 2017.
https://sites.ualberta.ca/~econwps/2017/wp201709.pdf
Abstract: This paper proposes a new test for comparing the out-of-sample forecasting performance of two competing models for situations in which the predictive content may be state-dependent (for example, expansion and recession states or low and high volatility states). To apply this test the econometrician is not required to observe when the underlying states shift. The test is simple to implement and accommodates several different cases of interest. An out-of-sample forecasting exercise for US output growth using real-time data illustrates the improvement of this test over previous approaches to perform forecast comparison.
Saturday, October 14, 2017
Machine Learning and Macro
Earlier I posted here on machine learning and central banking. Here's something related.
Last week Penn's Warren Center hosted a timely and stimulating conference, "Machine Learning for Macroeconomic Prediction and Policy". The program appears below. Papers were not posted, but with a little Googling you should be able to obtain those that are available.
Conference on Machine Learning for Macroeconomic Prediction and Policy
October 12 and 13, 2017
Glandt Forum, Singh Center for Nanotechnology
Co-Sponsored by Penn’s Warren Center for Network and Data Sciences
and the Federal Reserve Bank of Philadelphia
Organizers: Michael Dotsey (FRBP), Jesus Fernandez-Villaverde (Penn), Michael Kearns (Penn)
SCHEDULE:
Thursday October 12:
8:00 Breakfast
8:45 Welcome
9:00 Stephen Hansen (University of Oxford): The Long-Run Information Effect of Central Bank Text
9:45 Stephen Ryan (Washington University): Classification Trees for Heterogeneous Moment-Based Models
10:30 Break
11:00 James Cowie (DeepMacro): DeepMacro Data Challenges
11:45 Galo Nuno (Banco de España): Machine Learning and Heterogeneous Agent Models
12:30 Lunch
1:30 Francis X. Diebold (Penn): Egalitarian LASSO for Combining Central Bank Survey Forecasts
2:15 Lyle Ungar (Penn): How to Make Better Forecasts
3:00 Vegard Larsen (Norges Bank): Components of Uncertainty
3:45 Break
4:15 Panel: ML and Econometrics: Similarities and Differences (Michael Kearns, Vegard Larsen, Stephen Hansen, Rakesh Vohra (Penn))
Friday October 13:
9:00 Aaron Smalter Hall (Federal Reserve Bank of Kansas City): Recession Forecasting with Bayesian Classification
9:45 Susan Athey (Stanford GSB): Estimating Heterogeneity in Structural Parameters Using Generalized Random Forests
10:30 Break
11:00 Panel: ML Challenges at the Fed (Jose Canals-Cerda (Philadelphia Fed), Galo Nuno, Jesus Fernandez-Villaverde, Aaron Smalter Hall)
12:30 Lunch
Departures
Saturday, October 7, 2017
Long Memory in Realized Volatility
A noteworthy aspect of long memory in realized asset return volatility is that in many leading cases it's basically undeniable on the basis of a variety of evidence; the question isn't existence but rather strength. Hence it's useful to have a broad and comparable set of state-of-the-art (local Whittle) estimates together in one place, as in the interesting paper below. For the most part it gets d in [.4, .6], consistent with my personal experience of d usually around .45, in the covariance stationary (finite variance) region d<.5, but close to the boundary.
http://d.repec.org/n?u=RePEc:han:dpaper:dp601&r=ecm
Date: 2017-07
By: Wenger, Kai; Leschinski, Christian; Sibbertsen, Philipp
Abstract: The focus of the volatility literature on forecasting and the predominance of the conceptually simpler HAR model over long memory stochastic volatility models has led to the fact that the actual degree of memory estimates has rarely been considered. Estimates in the literature range roughly between 0.4 and 0.6 - that is from the higher stationary to the lower nonstationary region. This difference, however, has important practical implications - such as the existence or nonexistence of the fourth moment of the return distribution. Inference on the memory order is complicated by the presence of measurement error in realized volatility and the potential of spurious long memory. In this paper we provide a comprehensive analysis of the memory in variances of international stock indices and exchange rates. On the one hand, we find that the variance of exchange rates is subject to spurious long memory and the true memory parameter is in the higher stationary range. Stock index variances, on the other hand, are free of low frequency contaminations and the memory is in the lower nonstationary range. These results are obtained using state of the art local Whittle methods that allow consistent estimation in presence of perturbations or low frequency contaminations.
Keywords: Realized Volatility; Long Memory; Perturbation; Spurious Long Memory
JEL: C12 C22 C58 G15
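For readers who haven't seen a local Whittle estimate of d, here is a minimal sketch in Python of the basic (unmodified) local Whittle estimator, which minimizes \(\log\big(\frac{1}{m}\sum_{j=1}^m \lambda_j^{2d} I(\lambda_j)\big) - \frac{2d}{m}\sum_{j=1}^m \log \lambda_j\) over d. It is not the perturbation-robust variants used in the paper. The bandwidth rule, the grid search, the simulated data, and the function names are all assumptions made for illustration.

```python
import numpy as np


def local_whittle_d(x, m=None):
    """Basic local Whittle estimate of the memory parameter d (grid search)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if m is None:
        m = int(n ** 0.65)  # assumed bandwidth rule; a tuning choice
    lam = 2.0 * np.pi * np.arange(1, m + 1) / n        # first m Fourier frequencies
    dft = np.fft.fft(x - x.mean())[1:m + 1]
    periodogram = np.abs(dft) ** 2 / (2.0 * np.pi * n)
    mean_log_lam = np.mean(np.log(lam))
    grid = np.linspace(-0.49, 0.99, 149)                # candidate d values, step 0.01
    objective = [np.log(np.mean(lam ** (2.0 * d) * periodogram)) - 2.0 * d * mean_log_lam
                 for d in grid]
    return float(grid[int(np.argmin(objective))])


def simulate_fractional_noise(n, d, rng):
    """ARFIMA(0,d,0) draws via a truncated MA(infinity) representation (n lags)."""
    k = np.arange(1, n)
    psi = np.concatenate([[1.0], np.cumprod((k - 1 + d) / k)])
    eps = rng.standard_normal(2 * n)
    return np.convolve(eps, psi)[n:2 * n]


rng = np.random.default_rng(0)
d_white = local_whittle_d(rng.standard_normal(4096))                  # true d = 0
d_frac = local_whittle_d(simulate_fractional_noise(4096, 0.4, rng))   # true d = 0.4
print(d_white, d_frac)
```

With real realized-volatility data one would apply this to log realized volatility and experiment with the bandwidth m, since the estimate can be sensitive to it.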
Sunday, October 1, 2017
Economics Working Papers now in arXiv
Economics working papers are now a part of arXiv. This is great news, as arXiv is the premier working paper hosting platform in mathematics and the mathematical / statistical sciences. The Economics arXiv will start with a single subject area of Econometrics (econ.EM). More economics subject areas will be added (of course), and moreover, subject areas can and will be subdivided. Hats off to the econ.EM team (Victor Chernozhukov, MIT; Iván Fernández-Val, Boston University; Marc Henry, Penn State; Francesca Molinari, Cornell; Jörg Stoye, Bonn & Cornell; Martin Weidner, University College London). The full announcement is here.
Sunday, September 24, 2017
Egalitarian LASSO for Forecast Combination
Here's a new one. It was something of a long and winding road. We introduce simple "egalitarian LASSO" procedures that set some combining weights to zero and shrink those remaining toward equality. The feasible versions don't work very well, due to difficulties associated with cross-validating tuning parameters in small samples, but the lessons learned in studying the infeasible version turn out to be very valuable; indeed they directly motivate a new procedure, which we call "best <N-averaging", which solves the cross-validation problem and performs intriguingly well.
Diebold, F.X. and Shin, M. (2017), “Beating the Simple Average: Egalitarian LASSO for Combining Economic Forecasts”, Penn Institute for Economic Research (PIER) Working Paper No. 17-017, available at SSRN: https://ssrn.com/abstract=3032492.
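To give a feel for "shrinking toward equality", here is a minimal sketch in Python. It covers only the shrink-toward-equal-weights part, not the step that sets some weights to zero, and it is an illustration under my own assumptions rather than the code used in the paper. The penalty is on deviations of the weights from 1/K, which reduces to an ordinary LASSO of (y minus the simple average forecast) on the individual forecasts. The coordinate-descent solver, the tuning values, and the synthetic data are all hypothetical.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding operator used in LASSO coordinate updates."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def egalitarian_lasso(y, F, lam, n_iter=500):
    """Minimize (1/(2T)) * ||y - F w||^2 + lam * sum_i |w_i - 1/K| over w.

    Writing w = 1/K + delta turns this into a standard LASSO in delta, with
    dependent variable y minus the simple average of the K forecasts.
    """
    y = np.asarray(y, dtype=float)
    F = np.asarray(F, dtype=float)
    T, K = F.shape
    resid = y - F.mean(axis=1)            # error of the equal-weight combination
    delta = np.zeros(K)                   # deviations from equal weights
    col_ss = (F ** 2).sum(axis=0) / T
    for _ in range(n_iter):               # plain cyclic coordinate descent
        for j in range(K):
            resid += F[:, j] * delta[j]   # remove forecast j's current contribution
            rho = F[:, j] @ resid / T
            delta[j] = soft_threshold(rho, lam) / col_ss[j]
            resid -= F[:, j] * delta[j]   # add back the updated contribution
    return delta + 1.0 / K


# Synthetic illustration (not realistic forecasts): y is an exact combination.
rng = np.random.default_rng(0)
T, K = 200, 3
F = rng.standard_normal((T, K))
w_true = np.array([0.6, 0.3, 0.1])
y = F @ w_true

w_ols = egalitarian_lasso(y, F, lam=0.0)      # no penalty: least-squares weights
w_mid = egalitarian_lasso(y, F, lam=0.05)     # some shrinkage toward 1/K
w_equal = egalitarian_lasso(y, F, lam=1e6)    # heavy penalty: simple average
print(w_ols, w_mid, w_equal)
```

The hard part in practice is choosing lam, which is exactly the small-sample cross-validation difficulty described above.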
Friday, September 22, 2017
National Bank of Poland
It strikes me that I'm seeing progressively more research in dynamic predictive modeling from the National Bank of Poland. A few recent examples appear below. Related information is here. Nice job.
Karol Szafranek, "Bagged artificial neural networks in forecasting inflation: An extensive comparison with current modelling frameworks", 2017, No. 262.
Siem Jan Koopman, André Lucas, and Marcin Zamojski, "Dynamic term structure models with score-driven time-varying parameters: estimation and forecasting", 2017, No. 258.
Piotr Bańbuła and Marcin Pietrzak, "Early warning models of banking crises applicable to non-crisis countries", 2017, No. 257.
Alessia Paccagnini, "Forecasting with FAVAR: macroeconomic versus financial factors", 2017, No. 256.
Sunday, September 17, 2017
Machine Learning Meets Central Banking
Here's a nice new working paper from the Bank of England. There's nothing new methodologically, but there are three fascinating and detailed applications / case studies (banking supervision under imperfect information, UK CPI inflation forecasting, unicorns in financial technology). For your visual enjoyment I include their Figure 19 below. (It's the network graph for global technology start-ups in 2014, not spin-art...)
Monday, September 11, 2017
2017 NBER-NSF Time Series Meeting
Just back from 2017 NBER-NSF Time Series at Northwestern. Quite a feast, and my head is spinning. Program dumped below; formatted version here. Many thanks to the program committee for producing this event, and more generally for keeping the series going, year after year, stronger than ever. (See here for some history and links to past locations, programs, etc.)
The papers were very strong. Among those that I found particularly interesting are:
- Moon. Forecasting in short panels. You'd think it would be impossible since you need the individual effects. But it's not.
“Forecasting with Dynamic Panel Data Models”, Hyungsik Roger Moon (University of Southern California), Laura Liu, and Frank Schorfheide
- Shephard. Causal estimation meets time series.
“Time series experiments, causal estimands and exact p-values”, Neil Shephard (Harvard University) and Iavor Bojinov
- The entire (and marvelously coherent) "Lumsdaine Session" (Pruitt, Pelger, Giglio). Real progress on econometric methods for identifying financial-market risk factors, with sharp empirical results.
“Instrumented Principal Component Analysis”, Seth Pruitt (Arizona State University), Bryan Kelly, and Yinan Su
“Estimating Latent Asset-Pricing Factors”, Markus Pelger (Stanford University) and Martin Lettau
“Inference on Risk Premia in the Presence of Omitted Factors”, Stefano Giglio (University of Chicago) and Dacheng Xiu

2017 NBER-NSF Time Series Conference
Friday, September 8 – Saturday, September 9
Kellogg School of Management
Kellogg Global Hub
2211 N Campus Drive; Evanston, IL 60208
Friday, September 8
Registration begins 10:20am (White Auditorium)
Welcome and opening remarks: 10:50am
Session 1: 11:00am – 12:30pm
Chair: Ruey S. Tsay (University of Chicago)
“Egalitarian Lasso for Shrinkage and Selection in Forecast Combination” Francis X. Diebold (University of Pennsylvania) and Minchul Shin
“Forecasting with Dynamic Panel Data Models” Hyungsik Roger Moon (University of Southern California), Laura Liu, and Frank Schorfheide
“Large Vector Autoregressions with Stochastic Volatility and Flexible Priors” Andrea Carriero (Queen Mary University of London), Todd E. Clark, and Massimiliano Marcellino
12:30pm – 2:00pm: Lunch and Poster Session 1 (Faculty Summit, 4th Floor)
“The Dynamics of Expected Returns: Evidence from Multi-Scale Time Series Modeling” Daniele Bianchi (University of Warwick)
“Testing for Unit-root Nonstationarity against Threshold Stationarity” Kung-Sik Chan (University of Iowa)
“Group Orthogonal Greedy Algorithm for Changepoint Estimation of Multivariate Time Series” Ngai Hang Chan (The Chinese University of Hong Kong)
“The Impact of Waiting Times on Volatility Filtering and Dynamic Portfolio Allocation” Dobrislav Dobrev (Federal Reserve Board of Governors)
“Testing for Mutually Exciting Jumps and Financial Flights in High Frequency Data” Mardi Dungey (University of Tasmania), Xiye Yang (Rutgers University) presenting
“Pockets of Predictability” Leland E. Farmer (University of California, San Diego)
“Factor Models of Arbitrary Strength” Simon Freyaldenhoven (Brown University)
“Inference for VARs Identified with Sign Restrictions” Eleonora Granziera (Bank of Finland)
“The Time-Varying Effects of Conventional and Unconventional Monetary Policy: Results from a New Identification Procedure” Atsushi Inoue (Vanderbilt University)
“On spectral density estimation via nonlinear wavelet methods for non-Gaussian linear processes” Linyuan Li (University of New Hampshire)
“Multivariate Bayesian Predictive Synthesis in Macroeconomic Forecasting” Kenichiro McAlinn (Duke University)
“Periodic dynamic factor models: Estimation approaches and applications” Vladas Pipiras (University of North Carolina)
“Canonical stochastic cycles and bandpass filters for multivariate time series” Thomas M. Trimbur (U. S. Census Bureau)
Session 2: 2:00pm – 3:30pm
Chair: Giorgio Primiceri (Northwestern University)
“Understanding the Sources of Macroeconomic Uncertainty” Tatevik Sekhposyan (Texas A&M University), Barbara Rossi, and Matthieu Soupre
“Safety, Liquidity, and the Natural Rate of Interest” Marco Del Negro (Federal Reserve Bank of New York), Domenico Giannone, Marc P. Giannoni, and Andrea Tambalotti
“Structural Interpretation of Vector Autoregressions with Incomplete Identification: Revisiting the Role of Oil Supply and Demand Shocks” Christiane Baumeister (University of Notre Dame) and James D. Hamilton
Afternoon Break: 3:30pm – 4:00pm
Session 3: 4:00pm – 5:30pm
Chair: Serena Ng (Columbia University)
“Controlling the Size of Autocorrelation Robust Tests” Benedikt M. Pötscher (University of Vienna) and David Preinerstorfer
“Heteroskedasticity Autocorrelation Robust Inference in Time Series Regressions with Missing Data” Timothy J. Vogelsang (Michigan State University) and Seung-Hwa Rho
“Time series experiments, causal estimands and exact p-values” Neil Shephard (Harvard University) and Iavor Bojinov
5:30pm – 7pm: Cocktail Reception and Poster Session 2 (Faculty Summit, 4th Floor)
“Macro Risks and the Term Structure of Interest Rates” Andrey Ermolov (Fordham University)
“Holdings-based Fund Performance Measures: Estimation and Inference” Wayne E. Ferson (University of Southern California), Junbo L. Wang (Louisiana State University) presenting
“Economic Predictions with Big Data: The Illusion of Sparsity” Domenico Giannone (Federal Reserve Bank of New York)
“Estimation and Inference of Dynamic Structural Factor Models with Overidentifying Restrictions” Xu Han (City University of Hong Kong)
“Bayesian Predictive Synthesis: Forecast Calibration and Combination” Matthew C. Johnson (Duke University)
“Time Series Modeling on Dynamic Networks” Jonas Krampe (TU Braunschweig)
“The Complexity of Bank Holding Companies: A Topological Approach” Robin L. Lumsdaine (American University)
“Sieve Estimation of Option Implied State Price Density” Zhongjun Qu (Boston University) – Junwen Lu (Boston University) presenting
“Linear Factor Models and the Estimation of Expected Returns” Cisil Sarisoy (Northwestern University)
“Efficient Parameter Estimation for Multivariate Jump-Diffusions” Gustavo Schwenkler (Boston University)
“News-Driven Uncertainty Fluctuations” Dongho Song (Boston College)
“Contagion, Systemic Risk and Diagnostic Tests in Large Mixed Panels” Cindy S.H. Wang (National Tsing Hua University and CORE, University Catholique de Louvain)
7–10pm: Dinner (White Auditorium)
Dinner speaker: Nobel Laureate Robert F. Engle
Saturday, September 9
Continental Breakfast: 8:00am – 8:30am
Registration begins 8:30am (White Auditorium)
Session 4: 9:00am – 10:30am
Chair: Thomas Severini (Northwestern University)
“Estimation of time varying covariance matrices for large datasets” Liudas Giraitis (Queen Mary University of London), Y. Dendramis, and G. Kapetanios
“Indirect Inference With(Out) Constraints” Eric Renault (Brown University) and David T. Frazier
“Edgeworth expansions for a class of spectral density estimators and their applications to interval estimation” S.N. Lahiri (North Carolina State University) and A. Chatterjee
Morning Break: 10:30am – 11:00am
Session 5: 11:00am – 12:30pm
Chair: Robin L. Lumsdaine (American University)
“Instrumented Principal Component Analysis” Seth Pruitt (Arizona State University), Bryan Kelly, and Yinan Su
“Estimating Latent Asset-Pricing Factors” Markus Pelger (Stanford University) and Martin Lettau
“Inference on Risk Premia in the Presence of Omitted Factors” Stefano Giglio (University of Chicago) and Dacheng Xiu
12:30pm – 2pm: Lunch and Poster Session 3 (Faculty Summit, 4th Floor)
“Regularizing Bayesian Predictive Regressions” Guanhao Feng (City University of Hong Kong)
“Good Jumps, Bad Jumps, and Conditional Equity Premium” Hui Guo (University of Cincinnati)
“High-dimensional Linear Regression for Dependent Observations with Application to Nowcasting” Yuefeng Han (The University of Chicago)
“Maximum Likelihood Estimation for Integer-valued Asymmetric GARCH (INAGARCH) Models” Xiaofei Hu (BMO Harris Bank, N.A.)
“Tail Risk in Momentum Strategy Returns” Soohun Kim (Georgia Institute of Technology)
“The Perils of Counterfactual Analysis with Integrated Processes” Marcelo C. Medeiros (Pontifical Catholic University of Rio de Janeiro) and Ricardo Masini (Pontifical Catholic University of Rio de Janeiro)
“Anxious unit root processes” Jon Michel (The Ohio State University)
“Limiting Local Powers and Power Envelopes of Panel AR and MA Unit Root Tests” Katsuto Tanaka (Gakushuin University)
“High-Frequency Cross-Market Trading: Model Free Measurement and Applications” Ernst Schaumburg (AQR Capital Management, LLC) – Dobrislav Dobrev (Federal Reserve Board of Governors) presenting
“A persistence-based Wold-type decomposition for stationary time series” Claudio Tebaldi (Bocconi University)
“Necessary and Sufficient Conditions for Solving Multivariate Linear Rational Expectations Models and Factoring Matrix Polynomials” Peter A. Zadrozny (Bureau of Labor Statistics)
Session 6: 2:00pm – 3:30pm
Chair: Beth Andrews (Northwestern University)
“Models for Time Series of Counts with Shape Constraints” Richard A. Davis (Columbia University) and Jing Zhang
“Computationally Efficient Distribution Theory for Bayesian Inference of High-Dimensional Dependent Count-Valued Data” Scott H. Holan (University of Missouri, U.S. Census Bureau), Jonathan R. Bradley, and Christopher K. Wikle
“Functional Autoregression for Sparsely Sampled Data” Daniel R. Kowal (Cornell University, Rice University)
The papers were very strong. Among those that I found particularly interesting are:
-- Moon. Forecasting in short panels. You'd think it would be impossible since you need the individual effects. But it's not.
“Forecasting with Dynamic Panel Data Models”, Hyungsik Roger Moon (University of Southern California), Laura Liu, and Frank Schorfheide
-- Shephard. Causal estimation meets time series.
“Time series experiments, causal estimands and exact p-values”, Neil Shephard (Harvard University) and Iavor Bojinov
-- The entire (and marvelously coherent) "Lumsdaine Session" (Pruitt, Pelger, Giglio). Real progress on econometric methods for identifying financial-market risk factors, with sharp empirical results.
“Instrumented Principal Component Analysis”, Seth Pruitt (Arizona State University), Bryan Kelly, and Yinan Su
“Estimating Latent Asset-Pricing Factors”, Markus Pelger (Stanford University) and Martin Lettau
“Inference on Risk Premia in the Presence of Omitted Factors”, Stefano Giglio (University of Chicago) and Dacheng Xiu

Monday, September 4, 2017
More on New p-Value Thresholds
I recently blogged on a new proposal heavily backed by elite statisticians to "redefine statistical significance", forthcoming in the elite journal Nature Human Behaviour. (A link to the proposal appears at the end of this post.)
I have a bit more to say. It's not just that I find the proposal counterproductive; I have to admit that I also find it annoying, bordering on offensive.
I find it inconceivable that the authors' p<.005 recommendation will affect their own behavior, or that of others like them. They're all skilled statisticians, hardly so naive as to declare a "discovery" simply because a p-value does or doesn't cross a magic threshold, whether .05 or .005. Serious evaluations and interpretations of statistical analyses by serious statisticians are much more nuanced and rich -- witness the extended and often-heated discussion in any good applied statistics seminar.
If the p<.005 threshold won't change the behavior of skilled statisticians, then whose behavior MIGHT it change? That is, reading between the lines, to whom is the proposal REALLY addressed? Evidently those much less skilled, the proverbial "practitioners", whom the authors evidently hope to keep out of trouble by providing a rule of thumb that can at least be followed mechanically.
How patronizing.

Redefine Statistical Significance
Date: 2017
By:
Daniel Benjamin ; James Berger ; Magnus Johannesson ; Brian Nosek ; E. Wagenmakers ; Richard Berk ; Kenneth Bollen ; Bjorn Brembs ; Lawrence Brown ; Colin Camerer ; David Cesarini ; Christopher Chambers ; Merlise Clyde ; Thomas Cook ; Paul De Boeck ; Zoltan Dienes ; Anna Dreber ; Kenny Easwaran ; Charles Efferson ; Ernst Fehr ; Fiona Fidler ; Andy Field ; Malcom Forster ; Edward George ; Tarun Ramadorai ; Richard Gonzalez ; Steven Goodman ; Edwin Green ; Donald Green ; Anthony Greenwald ; Jarrod Hadfield ; Larry Hedges ; Leonhard Held ; Teck Hau Ho ; Herbert Hoijtink ; James Jones ; Daniel Hruschka ; Kosuke Imai ; Guido Imbens ; John Ioannidis ; Minjeong Jeon ; Michael Kirchler ; David Laibson ; John List ; Roderick Little ; Arthur Lupia ; Edouard Machery ; Scott Maxwell; Michael McCarthy ; Don Moore ; Stephen Morgan ; Marcus Munafo ; Shinichi Nakagawa ; Brendan Nyhan ; Timothy Parker ; Luis Pericchi; Marco Perugini ; Jeff Rouder ; Judith Rousseau ; Victoria Savalei ; Felix Schonbrodt ; Thomas Sellke ; Betsy Sinclair ; Dustin Tingley; Trisha Zandt ; Simine Vazire ; Duncan Watts; Christopher Winship ; Robert Wolpert ; Yu Xie; Cristobal Young ; Jonathan Zinman ; Valen Johnson
Abstract: We propose to change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005.
http://d.repec.org/n?u=RePEc:feb:artefa:00612&r=ecm
Sunday, August 27, 2017
New p-Value Thresholds for Statistical Significance
This is presently among the hottest topics / discussions / developments in statistics. Seriously. Just look at the abstract and dozens of distinguished authors of the paper below, which is forthcoming in one of the world's leading science outlets, Nature Human Behaviour.
Of course data mining, or overfitting, or whatever you want to call it, has always been a problem, warranting strong and healthy skepticism regarding alleged "new discoveries". But the whole point of examining p-values is to AVOID anchoring on arbitrary significance thresholds, whether the old magic .05 or the newly proposed magic .005. Just report the p-value, and let people decide for themselves how they feel. Why obsess over asterisks, and whether/when to put them next to things?
Postscript:
Reading the paper, which I had not done before writing the paragraph above (there's largely no need, as the wonderfully concise abstract says it all), I see that it anticipates my objection at the end of a section entitled "potential objections":
"Changing the significance threshold is a distraction from the real solution, which is to replace null hypothesis significance testing (and bright-line thresholds) with more focus on effect sizes and confidence intervals, treating the P-value as a continuous measure, and/or a Bayesian method."
Hear, hear! Marvelously well put.
The paper offers only a feeble refutation of that "potential" objection:
"Many of us agree that there are better approaches to statistical analyses than null hypothesis significance testing, but as yet there is no consensus regarding the appropriate choice of replacement. ... Even after the significance threshold is changed, many of us will continue to advocate for alternatives to null hypothesis significance testing."
I'm all for advocating alternatives to significance testing. That's important and helpful. As for continuing to promulgate significance testing with magic significance thresholds, whether .05 or .005, well, you can decide for yourself.
Redefine Statistical Significance
Friday, August 25, 2017
Flipping the https Switch
I just flipped a switch to convert No Hesitations from http to https, which should be totally inconsequential to you -- you should not need to do anything, but obviously let me know if your browser chokes. The switch will definitely solve one problem: Chrome has announced that it will soon REQUIRE https. Moreover, the switch may help with another problem. There have been issues over the years with certain antivirus software blocking No Hesitations without a manual override. The main culprit seems to be Kaspersky Antivirus. Maybe that will now stop.
Sunday, August 20, 2017
Bayesian Random Projection (More on Terabytes of Economic Data)
Some additional thoughts related to Serena Ng's World Congress piece (earlier post here, with a link to her paper):
The key newish dimensionality-reduction strategies that Serena emphasizes are random projection and leverage score sampling. In a regression context both are methods for optimally approximating an NxK "X matrix" with an Nxk X matrix, where k<<K. They are very different and there are many issues. Random projection delivers a smaller X matrix with columns that are linear combinations of those of the original X matrix, as for example with principal-component regression, which can sometimes make for difficult interpretation. Leverage score sampling, in contrast, delivers a smaller X matrix with columns that are simply a subset of those of the original X matrix, which feels cleaner but has issues of its own.
Anyway, a crucial observation is that for successful predictive modeling we don't need deep interpretation, so random projection is potentially just fine -- if it works, it works, and that's an empirical matter. Econometric extensions (e.g., to VARs) and evidence (e.g., in macro forecasting) are just now emerging, and the results appear encouraging. An important recent contribution in that regard is Koop, Korobilis, and Pettenuzzo (in press), which significantly extends and applies earlier work of Guhaniyogi and Dunson (2015) on Bayesian random projection ("compression"). Bayesian compression fits beautifully in an MCMC framework (again see Koop et al.), including model averaging across multiple random projections, attaching greater weight to projections that forecast well. Very exciting!
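To make the contrast between the two strategies concrete, here is a minimal NumPy sketch on simulated data. Everything in it is an illustrative assumption rather than anything taken from Serena's paper or from Koop et al.: the dimensions, the Gaussian projection matrix R with 1/sqrt(k) scaling, and the use of rank-k column leverage scores (squared entries of the top-k right singular vectors) as sampling probabilities, which is just one common definition.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions): N observations, K regressors, k << K.
N, K, k = 200, 50, 5
X = rng.standard_normal((N, K))

# Random projection: every column of X_rp is a random linear combination
# of ALL K original columns (Gaussian R is one of several possible choices).
R = rng.standard_normal((K, k)) / np.sqrt(k)
X_rp = X @ R

# Leverage score sampling (column version): columns of X_ls are a SUBSET
# of the original columns, drawn with probability proportional to their
# rank-k leverage scores.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
lev = np.sum(Vt[:k, :] ** 2, axis=0)      # one score per original column
probs = lev / lev.sum()
cols = np.sort(rng.choice(K, size=k, replace=False, p=probs))
X_ls = X[:, cols]

print("random projection shape:", X_rp.shape)
print("leverage sampling kept columns:", cols)
```

Note the interpretability difference: the columns of X_ls are original regressors that one can name, whereas the columns of X_rp are mixtures of all of them.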
Monday, August 14, 2017
Analyzing Terabytes of Economic Data
Serena Ng's World Congress piece is out as an NBER w.p. It's been floating around for a long time, but just in case you missed it, it's a fun and insightful read:
Opportunities and Challenges: Lessons from Analyzing Terabytes of Scanner Data
by Serena Ng -- NBER Working Paper #23673.
http://papers.nber.org/papers/w23673
(Ungated copy at http://www.columbia.edu/~sn2294/papers/sngworldcongress.pdf)
Abstract:
This paper seeks to better understand what makes big data analysis different, what we can and cannot do with existing econometric tools, and what issues need to be dealt with in order to work with the data efficiently. As a case study, I set out to extract any business cycle information that might exist in four terabytes of weekly scanner data. The main challenge is to handle the volume, variety, and characteristics of the data within the constraints of our computing environment. Scalable and efficient algorithms are available to ease the computation burden, but they often have unknown statistical properties and are not designed for the purpose of efficient estimation or optimal inference. As well, economic data have unique characteristics that generic algorithms may not accommodate. There is a need for computationally efficient econometric methods as big data is likely here to stay.
Saturday, August 12, 2017
On Theory, Measurement, and Lewbel's Assertion
Arthur Lewbel, insightful as always, asserts in a recent post that:
"The people who argue that machine learning, natural experiments, and randomized controlled trials are replacing structural economic modeling and theory are wronger than wrong.
As ML and experiments uncover ever more previously unknown correlations and connections, the desire to understand these newfound relationships will rise, thereby increasing, not decreasing, the demand for structural economic theory and models."
I agree. New measurement produces new theory, and new theory produces new measurement -- it's hard to imagine stronger complements. And as I said in an earlier post,
"Measurement and theory are rarely advanced at the same time, by the same team, in the same work. And they don't need to be. Instead we exploit the division of labor, as we should. Measurement can advance significantly with little theory, and theory can advance significantly with little measurement. Still each disciplines the other in the long run, and science advances."
The theory/measurement pendulum tends to swing widely. If the 1970s and 1980s were a golden age of economic theory, recent decades have witnessed explosive advances in economic measurement linked to the explosion of Big Data. But Big Data presents both measurement opportunities and pitfalls -- dense fogs of "digital exhaust" -- which fresh theory will help us penetrate. Theory will be back.
[Related earlier posts: "Big Data the Big Hassle" and "Theory gets too Much Respect, and Measurement Doesn't get Enough"]
Saturday, August 5, 2017
Commodity Connectedness
Forthcoming paper here.
We study connectedness among the major commodity markets, summarizing and visualizing the results using tools from network science.
Among other things, the results reveal clear clustering of commodities into groups closely related to the traditional industry taxonomy, but with some notable differences.
Many thanks to the Central Bank of Chile for encouraging and supporting the effort via its 2017 Annual Research Conference.
Sunday, July 30, 2017
Regression Discontinuity and Event Studies in Time Series
Check out the new paper, "Regression Discontinuity in Time [RDiT]: Considerations for Empirical Applications", by Catherine Hausman and David S. Rapson. (NBER Working Paper No. 23602, July 2017. Ungated copy here.)
It's interesting in part because it documents and contributes to the largely cross-section regression discontinuity design literature's awakening to time series. But the elephant in the room is the large time-series "event study" (ES) literature, mentioned but not emphasized by Hausman and Rapson. [In a one-sentence nutshell, here's how an ES works: model the pre-event period, use the fitted pre-event model to predict the post-event period, and ascribe any systematic forecast error to the causal impact of the event.] Event studies trace to the classic Fama et al. (1969). Among many others, MacKinlay's 1997 overview is still fresh, and Gürkaynak and Wright (2013) provide additional perspective.
One question is what the RDiT approach adds to the ES approach and, relatedly, what it adds to the well-developed time-series toolkit of other methods for assessing structural change. At present, and notwithstanding the Hausman-Rapson paper, my view is "little or nothing". Indeed in most respects it would seem that an RDiT study *is* an ES, and conversely. So call it what you will, "ES" or "RDiT".
But there are important open issues in ES / RDiT, and Hausman-Rapson correctly emphasize one of them, namely the difficulties associated with "wide" pre- and post-event windows, which are often the relevant case in time series.
Things are generally "easy" in cross sections, where we can usually take narrow windows (e.g., in the classic scholarship exam example, we use only test scores very close to the scholarship threshold). Things are similarly "easy" in time series *IF* we can take similarly narrow windows (e.g., high-frequency asset return data facilitate taking narrow pre- and post-event windows in financial applications). In such cases it's comparatively easy to credibly ascribe a post-event break to the causal impact of the event.
But in other time-series areas like macro and environmental, we might want (or need) to use wide pre- and post-event windows. Then the trick becomes modeling the pre- and post-event periods successfully enough so that we can credibly assert that any structural change is due exclusively to the event -- very challenging, but not hopeless.
Hats off to Hausman and Rapson for beginning to bridge the ES and regression discontinuity literatures, and for implicitly helping to push the ES literature forward.
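For readers who like to see the ES recipe in code, here is a minimal sketch of the three steps (model the pre-event period, forecast the post-event period, read the event's impact off the forecast errors). It is only an illustration under strong assumptions: the data are simulated, the event date is known, the pre-event model is a simple linear trend fit by OLS, and the event causes a pure level shift. None of the numbers come from Hausman and Rapson.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated series (all numbers are illustrative assumptions).
T, event = 200, 150            # sample size and known event date
true_effect = 2.0              # level shift caused by the event
t = np.arange(T)
y = 1.0 + 0.05 * t + rng.normal(scale=0.5, size=T)
y[event:] += true_effect

# Step 1: model the pre-event period (OLS on a constant and a linear trend).
X = np.column_stack([np.ones(T), t])
beta, *_ = np.linalg.lstsq(X[:event], y[:event], rcond=None)

# Step 2: use the fitted pre-event model to predict the post-event period.
y_hat_post = X[event:] @ beta

# Step 3: ascribe the systematic post-event forecast error to the event.
forecast_errors = y[event:] - y_hat_post
estimated_effect = forecast_errors.mean()

print(f"estimated event effect: {estimated_effect:.2f} (true: {true_effect})")
```

With a wide post-event window, as here, the estimate is only as credible as the pre-event model's extrapolation, which is exactly the difficulty emphasized above.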