88th International Atlantic Economic Conference
October 17 - 20, 2019 | Miami, USA

Machine-based interpolation of wealth and income inequality data

Saturday, 19 October 2019: 4:30 PM
James Chen, J.D., College of Law, Michigan State University, East Lansing, MI
National and regional economic reports routinely describe wealth and income inequality in terms of the Gini coefficient. Though intuitive and readily understood even by unsophisticated consumers of economic data (such as journalists and ordinary businessmen and businesswomen), the Gini coefficient does not convey full information about the shape of a wealth or income distribution. Because each value of the Gini coefficient, apart from the trivial cases of 0 (perfect equality) and 1 (perfect inequality), describes a whole family of distributions, proper use of this measure should take account of the shape of inequality as well as its scalar value.
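
The point that one Gini value covers a family of distributions can be checked numerically. The sketch below is illustrative only: the two Lorenz curves, their parameters, and the function names are assumptions chosen for the example, not material from the presentation. Both curves have a Gini coefficient of 0.5 in closed form, yet they assign very different shares to the bottom half and to the top decile.

```python
# Two hypothetical Lorenz curves chosen for illustration (assumptions,
# not taken from the presentation). Both have an exact Gini of 0.5.

def power_lorenz(p, a=3.0):
    """L(p) = p**a; exact Gini is (a - 1) / (a + 1)."""
    return p ** a


def pareto_lorenz(p, b=1.0 / 3.0):
    """L(p) = 1 - (1 - p)**b, a Pareto-type curve; exact Gini is (1 - b) / (1 + b)."""
    return 1.0 - (1.0 - p) ** b


def gini(lorenz, n=100_000):
    """Gini = 1 - 2 * (area under the Lorenz curve), by the midpoint rule."""
    h = 1.0 / n
    area = h * sum(lorenz((i + 0.5) * h) for i in range(n))
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    for name, curve in [("power", power_lorenz), ("pareto", pareto_lorenz)]:
        print(f"{name:>6}: Gini = {gini(curve):.4f}, "
              f"bottom-half share = {curve(0.5):.3f}, "
              f"top-decile share = {1.0 - curve(0.9):.3f}")
```

Under these assumptions the power curve gives the bottom half a smaller share than the Pareto-type curve does, while the Pareto-type curve gives the top decile a larger share, even though the scalar Gini value is identical.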

Some measure of the internal asymmetry of the distribution, akin to skewness in ordinary probability distributions, is needed to convey the extent to which inequality is attributable to a large number of impoverished people or, alternatively, to the concentration of wealth or income at the top of the distribution. Although both effects are present in any distribution other than the trivial instance of perfect equality, nearly all distributions are tilted in one direction or the other. Such internal asymmetry can be described through either differential or integral calculus.
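
One differential-calculus measure of this kind is the Lorenz asymmetry coefficient, S = p* + L(p*), where p* is the point at which the slope of the Lorenz curve equals 1. Values below 1 indicate inequality driven mainly by the many poor; values above 1 indicate concentration at the top. The sketch below is a minimal illustration, not necessarily the measure used in the presentation; the function name, the numerical derivative, and the two sample curves are assumptions.

```python
def lorenz_asymmetry(lorenz, tol=1e-10):
    """Return S = p* + L(p*), where p* solves L'(p*) = 1.

    S < 1: inequality stems mostly from the many poor.
    S > 1: inequality stems mostly from concentration at the top.
    """
    def slope(p, h=1e-7):
        # Central-difference estimate of L'(p).
        return (lorenz(p + h) - lorenz(p - h)) / (2.0 * h)

    lo, hi = 1e-6, 1.0 - 1e-6
    # A Lorenz curve is convex, so L' is nondecreasing and
    # bisection on L'(p) - 1 is applicable.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if slope(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    p_star = 0.5 * (lo + hi)
    return p_star + lorenz(p_star)


# Hypothetical sample curves (assumptions), both with a Gini of 0.5.
s_power = lorenz_asymmetry(lambda p: p ** 3)
s_pareto = lorenz_asymmetry(lambda p: 1.0 - (1.0 - p) ** (1.0 / 3.0))
print(f"power curve:  S = {s_power:.4f}")
print(f"pareto curve: S = {s_pareto:.4f}")
```

For these two sample curves the coefficient falls on opposite sides of 1, which is the kind of tilt that the scalar Gini value alone cannot reveal.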

Proper application of the calculus, however, requires interpolation. Data describing wealth and income inequality is usually reported in very crude quantiles, often no more precise than deciles or even quartiles. It can be proved that polynomial representations of the Lorenz curve used to calculate a Gini coefficient, as typically used to illustrate Pareto distributions of wealth or income, systematically portray inequality as a function of the size of the impoverished population rather than the concentration of wealth. Polynomial interpolation, which is often used to describe macroeconomic data such as the term structure of government bonds, is therefore inappropriate.
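
To see why crude quantiles call for careful interpolation, consider the simplest alternative: joining the reported points with straight lines. The sketch below uses a synthetic Lorenz curve, L(p) = p^3, whose exact Gini coefficient is 0.5; the curve and the helper names are assumptions made for illustration, not data from the presentation. Because a Lorenz curve is convex, straight-line segments lie on or above it, so the computed Gini can only understate the true value.

```python
def true_lorenz(p):
    """Synthetic Lorenz curve L(p) = p**3 (an assumption); exact Gini is 0.5."""
    return p ** 3


def tabulate(lorenz, k):
    """Sample a Lorenz curve at k equal-width quantiles, endpoints included."""
    return [(i / k, lorenz(i / k)) for i in range(k + 1)]


def gini_from_quantiles(points):
    """Trapezoid-rule Gini from (population share, cumulative share) pairs.

    Equivalent to joining the reported quantiles with straight lines.
    """
    area = 0.0
    for (p0, l0), (p1, l1) in zip(points, points[1:]):
        area += 0.5 * (l0 + l1) * (p1 - p0)
    return 1.0 - 2.0 * area


if __name__ == "__main__":
    for label, k in [("quartiles", 4), ("deciles", 10)]:
        g = gini_from_quantiles(tabulate(true_lorenz, k))
        print(f"{label:>9}: Gini = {g:.5f} (exact value 0.5)")
```

The shortfall shrinks as the quantiles become finer, but with quartile data it is already visible in the second decimal place for this synthetic curve.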

This presentation will demonstrate the use of the Python programming language to interpolate inequality data to a high degree of precision (whether measured by absolute error or squared error). Though machine-based, the technique presented here enables the Lorenz curve to be expressed in closed form and, consequently, to be evaluated through conventional tools of the calculus.
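
The abstract does not name the functional form or the fitting routine, so the following is only a sketch of the general idea under stated assumptions. It fits a two-parameter closed form, L(p) = (1 - (1 - p)^a)^(1/b) with 0 < a, b <= 1, to decile data by a coarse grid search that minimizes squared error, and then evaluates the Gini coefficient analytically through the Beta function. The functional form, the grid search, and the synthetic deciles are all assumptions; in practice a numerical optimizer would likely replace the grid.

```python
import math


def lorenz(p, a, b):
    """Assumed closed form L(p) = (1 - (1 - p)**a)**(1/b), with 0 < a, b <= 1."""
    return (1.0 - (1.0 - p) ** a) ** (1.0 / b)


def gini_closed_form(a, b):
    """Exact Gini: 1 - (2/a) * B(1/a, 1/b + 1), with B the Beta function."""
    x, y = 1.0 / a, 1.0 / b + 1.0
    # Work with log-gamma to avoid overflow for small a or b.
    beta = math.exp(math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y))
    return 1.0 - 2.0 * beta / a


def fit(points, steps=100):
    """Grid search over (a, b) minimizing squared error at the given quantiles."""
    best, best_sse = None, float("inf")
    for i in range(1, steps + 1):
        a = i / steps
        for j in range(1, steps + 1):
            b = j / steps
            sse = sum((lorenz(p, a, b) - share) ** 2 for p, share in points)
            if sse < best_sse:
                best, best_sse = (a, b), sse
    return best, best_sse


# Synthetic deciles generated from known parameters; these are not real data.
deciles = [(k / 10, lorenz(k / 10, 0.6, 0.8)) for k in range(1, 10)]
(a_hat, b_hat), sse = fit(deciles)
print(f"a = {a_hat}, b = {b_hat}, squared error = {sse:.3g}")
print(f"Gini from closed form = {gini_closed_form(a_hat, b_hat):.4f}")
```

Once the parameters are fixed, the fitted curve is an ordinary differentiable function, so its slope, its area, and any asymmetry measure built from them can be obtained with the usual tools of the calculus rather than from the raw quantiles.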