OWL differs from other machine learning tools in that it explicitly allows for high correlations among anomaly variables, an important feature of the data that is often neglected and that can cause severe complications for standard analytical tools such as ordinary least squares (OLS) regression. I derive an appealing statistical property of OWL: it groups highly correlated factors together by assigning them the same coefficient. Monte Carlo simulations show that OWL outperforms alternative machine learning tools such as LASSO and adaptive LASSO in a highly correlated setting. I also propose a select-and-test procedure to choose useful factors from the “factor zoo.” In the first step, I use OWL to obtain a sparse model. In the second step, I employ a bootstrap testing procedure, in which I bootstrap the null hypothesis directly, to validate each factor selected by OWL.
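To make the grouping behaviour concrete, the following minimal sketch assumes the standard ordered-weighted-L1 penalty, that is, the sum of w_i times the i-th largest absolute coefficient, with non-increasing, non-negative weights. It evaluates the penalty and computes its proximal operator by sorting, shifting, and pooling adjacent violators. The function names, the toy inputs, and the use of a proximal step are illustrative assumptions, not necessarily the estimation routine used in this paper.

```python
import numpy as np

def owl_penalty(beta, weights):
    """Ordered weighted L1 norm: sum_i w_i * |beta|_(i), |beta| sorted decreasingly."""
    abs_sorted = np.sort(np.abs(beta))[::-1]
    return float(np.dot(weights, abs_sorted))

def owl_prox(v, weights):
    """Proximal operator of the OWL norm (weights non-increasing and >= 0)."""
    v = np.asarray(v, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(-np.abs(v))       # indices sorting |v| in decreasing order
    z = np.abs(v)[order] - weights       # shift sorted magnitudes by the weights
    # Pool adjacent violators so the fitted sequence is non-increasing.
    blocks = []                          # each block stores [sum, count]
    for value in z:
        blocks.append([value, 1])
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] < blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = np.concatenate([np.full(c, s / c) for s, c in blocks])
    fitted = np.maximum(fitted, 0.0)     # magnitudes cannot be negative
    out = np.empty_like(v)
    out[order] = fitted                  # undo the sort
    return np.sign(v) * out

# Hypothetical toy input: two nearly equal coefficients and one small one.
shrunk = owl_prox([3.0, 2.9, 0.2], [2.0, 1.0, 0.5])
```

In this toy example the two nearly equal entries are pulled to a common value and the small entry is set to zero, which mirrors the grouping and sparsity properties described above.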
Empirically, I construct 81 anomaly variables from the Center for Research in Security Prices (CRSP) and COMPUSTAT databases, using over 2 million firm-level observations that cover 4506 stocks from January 1980 to December 2017. I compute the spread portfolios using value weighting and excluding micro stocks. I then construct 5-by-5 bivariate-sorted test portfolios following Fama and French (1993, 2015). Specifically, I pair “size” with each of the remaining 80 anomaly variables to construct 25 portfolios per pair and pool them together as test portfolios. OWL estimation and bootstrap testing suggest that some liquidity factors play an important role in explaining the cross-section of average stock returns, and that some profitability/investment factors are also significant in explaining average returns.
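The 5-by-5 construction can be sketched for a single cross-section as follows. This is only an illustration under stated assumptions: the sorts are independent, the quintile breakpoints are taken from the full sample of stocks in that cross-section, and the inputs `size`, `char`, `ret`, and `weight` are hypothetical arrays with one entry per stock. The breakpoint convention and sorting scheme actually used for the test portfolios may differ.

```python
import numpy as np

def quintile_bins(x):
    """Assign each value to a quintile 0..4 using sample breakpoints (an assumption)."""
    breakpoints = np.quantile(x, [0.2, 0.4, 0.6, 0.8])
    return np.searchsorted(breakpoints, x, side="right")

def double_sort_returns(size, char, ret, weight):
    """Value-weighted returns of the 25 cells of an independent 5x5 sort.

    Rows index size quintiles, columns index quintiles of the second
    characteristic. Empty cells are returned as NaN.
    """
    size, char = np.asarray(size, float), np.asarray(char, float)
    ret, weight = np.asarray(ret, float), np.asarray(weight, float)
    size_bin = quintile_bins(size)
    char_bin = quintile_bins(char)
    out = np.full((5, 5), np.nan)
    for i in range(5):
        for j in range(5):
            mask = (size_bin == i) & (char_bin == j)
            if mask.any():
                # weight would typically be lagged market capitalization
                out[i, j] = np.average(ret[mask], weights=weight[mask])
    return out
```

Repeating this for each month and for each of the 80 size-characteristic pairs, then stacking the resulting return series, would give a pooled set of test portfolios of the kind described above.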
Further robustness checks show that OWL-implied models outperform prominent benchmarks such as the Fama-French 5-factor, Carhart 4-factor, and Q-theory models across a broad range of criteria, as measured by the Sharpe ratio, the Hansen-Jagannathan distance, the cross-sectional R-squared, and the GRS (Gibbons-Ross-Shanken) statistic.
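As an illustration of one of these criteria, the sketch below computes the GRS statistic in its textbook form, (T - N - L)/N times the quadratic form of the estimated alphas in the inverse residual covariance, divided by one plus the squared Sharpe-type term of the factors, with both covariance matrices estimated by maximum likelihood (divided by T). The function name and argument layout are assumptions made for this example; under normality and the null of zero alphas the statistic follows an F(N, T - N - L) distribution.

```python
import numpy as np

def grs_statistic(excess_returns, factors):
    """GRS test statistic for T x N test-asset excess returns and T x L factors.

    Returns the statistic and the vector of estimated time-series alphas.
    """
    R = np.asarray(excess_returns, dtype=float)
    F = np.asarray(factors, dtype=float)
    T, N = R.shape
    L = F.shape[1]
    X = np.column_stack([np.ones(T), F])          # intercept plus factors
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)  # shape (L + 1, N)
    alpha = coef[0]                               # time-series intercepts
    resid = R - X @ coef
    sigma = resid.T @ resid / T                   # ML residual covariance
    mu = F.mean(axis=0)
    centered = F - mu
    omega = centered.T @ centered / T             # ML factor covariance
    quad_alpha = alpha @ np.linalg.solve(sigma, alpha)
    quad_mu = mu @ np.linalg.solve(omega, mu)
    stat = (T - N - L) / N * quad_alpha / (1.0 + quad_mu)
    return stat, alpha
```

A smaller statistic indicates that the candidate factor model leaves less unexplained average return in the test portfolios, which is the sense in which competing models can be ranked on this criterion.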