Model choice and size distribution: A Bayequentist approach

Engler, John-Oliver

Objectives/Motivation

Empirical work on size distributions in economic systems is divided between Robert Gibrat’s Law of Proportionate Effect (Gibrat, 1931) and the so-called Rank-Size Rule (Pareto, 1896). The systems in which either pattern has been identified are diverse and include firms, cities and incomes. Yet they all seem to share some feature that gives rise to very similar patterns over and over. While Gibrat’s Law predicts a lognormal distribution, the Rank-Size Rule predicts a power law for the upper tail of the distribution. Since both patterns have been found repeatedly in empirical data, there is a lively debate about the validity and implications of these findings. Against this background, we ask two main questions: (1) How exactly can we identify a certain size distribution model from empirical data? (2) What mechanisms at the individual level could explain the observed overall size distribution?

Data & Methods

This paper investigates a unique data set of 399 cattle farming businesses in the semi-arid regions of Namibia. The data set (Olbrich et al., 2009) captures variables such as farm size, measured both by cattle count and by area in hectares, and exogenous environmental risk, given by the variability of annual rainfall.

For data analysis, we propose a new three-step model selection framework for size distributions in empirical data. It combines frequentist and Bayesian statistical methods, and we refer to it as ‘Bayequentist’. In step 1, we generalize a recent non-standard unbiased frequentist hypothesis test for the detection of power laws (Clauset et al., 2007). The generalized version can test for any size distribution model in empirical data, provided that random numbers can be generated from the hypothesized distribution. In step 2, we use Akaike’s Information Criterion to calculate the evidence in favor of each model in light of the actual data, which assesses the relative goodness-of-fit of the models that have passed our test. In step 3, we promote the criterion of stochastic micro foundation by assessing the explanatory power and depth of the remaining models. Hence, steps 1 and 2 assess statistical significance while step 3 rates economic significance.
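Steps 1 and 2 can be illustrated with a minimal sketch. It is not the implementation used in the paper. It assumes that the test statistic is the Kolmogorov-Smirnov distance with a parametric bootstrap, in the spirit of Clauset et al., and that the "evidence" from Akaike's Information Criterion is expressed as Akaike weights. All function names are hypothetical, and the lognormal serves only as an example model; any model with a fitting routine, a CDF and a random number generator could be plugged in.

```python
import math
import random


def ks_distance(data, cdf):
    """Kolmogorov-Smirnov distance between the empirical CDF and a model CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # Compare the model CDF with the empirical CDF just before and at x.
        d = max(d, abs(f - i / n), abs((i + 1) / n - f))
    return d


def fit_lognormal(data):
    """Maximum-likelihood lognormal fit: mean and std of the log-sizes."""
    logs = [math.log(x) for x in data]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / len(logs))
    return mu, sigma


def lognormal_cdf(params):
    mu, sigma = params
    return lambda x: 0.5 * (1.0 + math.erf((math.log(x) - mu) / (sigma * math.sqrt(2.0))))


def lognormal_sampler(params, rng):
    mu, sigma = params
    return lambda: rng.lognormvariate(mu, sigma)


def bootstrap_pvalue(data, fit, make_cdf, make_sampler, n_boot=200, seed=0):
    """Step 1 (assumed form): share of synthetic samples that fit worse than the data."""
    rng = random.Random(seed)
    params = fit(data)
    d_obs = ks_distance(data, make_cdf(params))
    draw = make_sampler(params, rng)
    exceed = 0
    for _ in range(n_boot):
        synthetic = [draw() for _ in data]
        # Refit the model to each synthetic sample before measuring its distance.
        d_syn = ks_distance(synthetic, make_cdf(fit(synthetic)))
        if d_syn >= d_obs:
            exceed += 1
    return exceed / n_boot


def akaike_weights(log_likelihoods, n_params):
    """Step 2 (assumed form): normalized relative likelihoods of the candidate models."""
    aics = [2 * k - 2 * ll for ll, k in zip(log_likelihoods, n_params)]
    best = min(aics)
    rel = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]
```

A model would be retained in step 1 if its bootstrap p-value exceeds a chosen threshold. The surviving models are then compared in step 2 by passing their maximized log-likelihoods and parameter counts to `akaike_weights`.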

Results

With the help of our framework, we infer that the double Pareto lognormal distribution (Reed & Jorgensen, 2004) provides the best fit to our data. This finding implies that Gibrat's Law of Proportionate Effect holds true for our sample, albeit with two modifying assumptions: (1) farm ages are exponentially distributed and (2) the initial farm size distribution was lognormal. Our conclusion is threefold: (1) smaller farms do not grow significantly faster or slower than larger farms; (2) environmental risk is a key force in farm size growth rather than just a minor element; (3) the framework proposed here has the potential to resolve the ongoing debate about size distribution models in empirical socioeconomic data, the two most prominent of which are the Pareto and the lognormal distribution.
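The mechanism behind this result can be illustrated with a small simulation. The sketch below assumes that proportionate growth is modelled as geometric Brownian motion, which is the setting in which Reed & Jorgensen derive the double Pareto lognormal distribution. The function name and all parameter values are hypothetical and are not estimates from the Namibian data.

```python
import math
import random


def simulate_farm_sizes(n, mu0, sigma0, growth, volatility, age_rate, seed=0):
    """Draw n sizes from proportionate growth observed at exponentially distributed ages."""
    rng = random.Random(seed)
    # Drift of the log-size under geometric Brownian motion.
    drift = growth - 0.5 * volatility ** 2
    sizes = []
    for _ in range(n):
        log_size0 = rng.gauss(mu0, sigma0)   # lognormal initial size
        age = rng.expovariate(age_rate)      # exponentially distributed farm age
        shock = rng.gauss(0.0, 1.0)
        # Growth is independent of current size (Gibrat's Law), so it adds on the log scale.
        log_size = log_size0 + drift * age + volatility * math.sqrt(age) * shock
        sizes.append(math.exp(log_size))
    return sizes


# Illustrative parameters only: mean age 1 / 0.1 = 10 periods.
sizes = simulate_farm_sizes(1000, mu0=5.0, sigma0=0.5, growth=0.05,
                            volatility=0.3, age_rate=0.1, seed=1)
```

Mixing over the random ages is what produces power-law behaviour in both tails while keeping a lognormal-like body, which is how a single mechanism can generate both of the competing patterns.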

74th International Atlantic Economic Conference

October 04 - 07, 2012 | Montréal, Canada
