The Colonial Origins of Comparative Development:

An Empirical Investigation


Demetria Parisi,
Francesca Ventimiglia,
Sila Sahin.


A summary of the paper by Daron Acemoglu, Simon Johnson, and James A. Robinson


Abstract

How can the vast income disparities across the countries of the world be explained? What causes them? In this document, we reproduce the work of Acemoglu et al. and estimate how differences in institutions affect the economic performance of countries, measured by income (GDP) per capita. For this purpose, the authors use differences in the mortality rates faced by European settlers in colonized countries as an instrumental variable to identify exogenous variation in current institutions. The colonization policies adopted by Europeans differed across colonies and resulted in different types of institutions, which persist to the present day. Where mortality was high, for instance, the colonizers preferred to set up extractive institutions instead of settling. Hence, a two-stage least squares model is applied in which differences in mortality rates serve as a source of exogenous variation in institutional quality. Former colonies with high-quality institutions are expected to perform better economically today than those with merely extractive or low-quality institutions, since good institutions lead to higher investment in capital and a more efficient use of resources. These conditions, in turn, are presumed to produce a higher level of income. The reproduced paper seeks to establish this relationship.

Research question

The broad question motivating the paper is: “What are the fundamental causes of the large differences in income per capita across countries?” Our specific research question follows from it: using settler mortality rates in European colonies as an instrumental variable, what is the estimated effect of the current quality of institutions on income per capita, and therefore on economic performance?

Motivation

The Colonial Origins of Comparative Development: An Empirical Investigation by Acemoglu et al. is one of the most influential scientific works in its field. It has been the subject of numerous follow-up analyses and the basis of further research. Our motivation for choosing this work was to illustrate a significant economic paper using Python. An evaluation of the effect of institutions on economic performance can inform current policy decisions, which could lead to an improvement in the level of income per capita in countries struggling with their economic performance. We try to support this link by showing that better institutions are associated with a large and statistically significant increase in GDP per capita. Moreover, this study opens the way to further research on the most appropriate ways to improve institutions by addressing specific aspects such as property rights enforcement, rule of law and other institutional features.

Assumptions

The model we are about to develop is based on three main assumptions:

To measure current institutions, the authors use the index of protection against the risk of expropriation from Political Risk Services.

Method

The choice of an instrumental variable approach stems from the idea that institutions are related to a number of other variables which could bias the estimated effect. Countries may differ in their level of income per capita for a variety of reasons, including cultural and geographical factors, and richer countries may also be able to afford better institutions. By instrumenting current institutions with settler mortality rates, we aim to isolate the part of the variation in institutions that is unrelated to these factors, provided that settler mortality affects present-day income only through institutions. The idea behind the presented model can be summarized as follows: the settlers' mortality rates influence the settlement decision and the colonization strategy; the implemented policy then results in low or high quality of early institutions, which is reflected in current institutions and economic performance. The first stage of the model regresses the level of current institutions, which is the treatment variable, on settler mortality rates, the instrumental variable. In the second stage, the outcome of interest, here economic performance, is regressed on the resulting exogenous variation in the level of current institutions.
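The two-stage logic described above can be illustrated on simulated data. The sketch below is not the authors' code: the data-generating process, the coefficient values and all variable names (`z`, `r`, `y`, `confounder`) are assumptions chosen for illustration. An unobserved confounder biases the plain OLS slope, while the two-stage estimate stays close to the assumed true effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
true_effect = 0.9  # assumed causal effect of the treatment on the outcome

z = rng.normal(size=n)           # instrument (stands in for log settler mortality)
confounder = rng.normal(size=n)  # unobserved factor driving treatment and outcome
r = 1.0 - 0.6 * z + confounder + rng.normal(size=n)                # treatment
y = 2.0 + true_effect * r - 1.5 * confounder + rng.normal(size=n)  # outcome


def ols(x, target):
    """Return [intercept, slope] of a least-squares fit of target on x."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


# Naive OLS ignores the confounder and is therefore biased.
ols_slope = ols(r, y)[1]

# Stage 1: project the treatment on the instrument.
stage1 = ols(z, r)
r_hat = stage1[0] + stage1[1] * z

# Stage 2: regress the outcome on the fitted treatment values.
iv_slope = ols(r_hat, y)[1]

print(f"OLS slope:  {ols_slope:.3f}")
print(f"2SLS slope: {iv_slope:.3f} (assumed true effect: {true_effect})")
```

In this simulation the instrument is valid by construction; with real data, validity rests on the exclusion restriction and cannot be verified from the regression output alone.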

Descriptive Statistics

Before conducting the regression analysis, we take a look at the data and its properties. The data set contains 163 countries, with information on settler mortality, protection against expropriation risk, and PPP-adjusted income (GDP) per capita in 1995, although many entries are missing. In the paper, the authors base their analysis on a subsample limited to 64 ex-colonies. Since we do not have access to detailed information concerning the subsample, we conduct our regression analysis on the given worldwide sample. Because the sample used by Acemoglu et al. is drawn from the data that we use, we expect our regression results to approximate the findings of the paper, although they need not coincide with them. The variable describing the protection against expropriation risk serves to pick up institutional differences across countries. The histogram of this variable suggests that the majority of countries tend to have higher-quality institutions with stronger property rights. Regarding income levels across countries according to the World Bank for the year 1995, the distribution of log PPP-adjusted GDP per capita is roughly bell-shaped, with most countries having an average income level, few countries having an extremely low GDP and some countries having an extremely high GDP. Note that further information on the data and its sources can be found in Appendix Table A1, a table retrieved from the original paper.
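The tables displayed in this notebook include a column named `baseco`. A possible way to approximate the authors' subsample, sketched below, is to keep only rows where `baseco` equals 1; that this column flags the paper's base sample is an assumption, not something established in this notebook. The rows are copied from the table shown below, and the names `sample`, `base` and `complete` are hypothetical.

```python
import numpy as np
import pandas as pd

# A few rows copied from the descriptive-statistics table in this notebook.
sample = pd.DataFrame({
    "shortnam": ["AGO", "ARE", "ARG", "AUS", "AUT"],
    "avexpr":   [5.363636, 7.181818, 6.386364, 9.318182, 9.727273],
    "logpgp95": [7.770645, 9.804219, 9.133459, 9.897972, 9.974877],
    "logem4":   [5.634789, np.nan, 4.232656, 2.145931, np.nan],
    "baseco":   [1.0, np.nan, 1.0, 1.0, np.nan],
})

# Assumption: baseco == 1 marks the base sample of former colonies.
base = sample[sample["baseco"] == 1]

# Alternative without baseco: keep rows where every regression variable is observed.
complete = sample.dropna(subset=["avexpr", "logpgp95", "logem4"])

print(base["shortnam"].tolist())
print(len(complete))
```

Either filter makes explicit which countries enter a regression, instead of leaving the selection to the implicit dropping of missing values.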

In [2]:
import math
import numpy as np
import pandas as pd
from scipy import optimize
import matplotlib.pyplot as plt
%matplotlib inline
In [3]:
Data_descriptive_stats = pd.read_stata ('descriptive-statistics.dta')
In [4]:
Data_descriptive_stats
Out[4]:
shortnam euro1900 excolony avexpr logpgp95 cons1 cons90 democ00a cons00a extmort4 logem4 loghjypl baseco
0 AFG 0.000000 1.0 NaN NaN 1.0 2.0 1.0 1.0 93.699997 4.540098 NaN NaN
1 AGO 8.000000 1.0 5.363636 7.770645 3.0 3.0 0.0 1.0 280.000000 5.634789 -3.411248 1.0
2 ARE 0.000000 1.0 7.181818 9.804219 NaN NaN NaN NaN NaN NaN NaN NaN
3 ARG 60.000004 1.0 6.386364 9.133459 1.0 6.0 3.0 3.0 68.900002 4.232656 -0.872274 1.0
4 ARM 0.000000 0.0 NaN 7.682482 NaN NaN NaN NaN NaN NaN NaN NaN
5 AUS 98.000000 1.0 9.318182 9.897972 7.0 7.0 10.0 7.0 8.550000 2.145931 -0.170788 1.0
6 AUT 100.000000 0.0 9.727273 9.974877 NaN NaN NaN NaN NaN NaN -0.343900 NaN
7 AZE 0.000000 0.0 NaN 7.306531 NaN NaN NaN NaN NaN NaN NaN NaN
8 BDI 0.000000 1.0 NaN 6.565265 5.0 1.0 0.0 1.0 280.000000 5.634789 -3.506558 NaN
9 BEL 100.000000 0.0 9.681818 9.992871 NaN NaN NaN NaN NaN NaN -0.179127 NaN
10 BEN 0.000000 1.0 NaN 7.090077 3.0 1.0 0.0 1.0 266.519989 5.585449 -2.830218 NaN
11 BFA 0.000000 1.0 4.454545 6.845880 3.0 1.0 0.0 1.0 280.000000 5.634789 -3.540459 1.0
12 BGD 0.000000 1.0 5.136364 6.877296 7.0 2.0 0.0 1.0 71.410004 4.268438 -2.063568 1.0
13 BGR 100.000000 0.0 8.909091 8.457443 NaN NaN NaN NaN NaN NaN NaN NaN
14 BHR 0.000000 1.0 8.000000 9.685953 NaN NaN NaN NaN NaN NaN NaN NaN
15 BHS 10.000000 1.0 7.500000 9.285448 NaN NaN NaN NaN 85.000000 4.442651 NaN 1.0
16 BIH 100.000000 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
17 BLR 100.000000 0.0 NaN 8.340456 NaN NaN NaN NaN NaN NaN NaN NaN
18 BLZ 20.000000 1.0 NaN 8.377932 NaN NaN NaN 1.0 163.300003 5.095589 NaN NaN
19 BOL 30.000002 1.0 5.636364 7.926602 3.0 7.0 4.0 3.0 71.000000 4.262680 -1.966113 1.0
20 BRA 40.000000 1.0 7.909091 8.727454 1.0 7.0 1.0 3.0 71.000000 4.262680 -1.142564 1.0
21 BRB 20.000000 1.0 NaN 9.266721 NaN NaN NaN 1.0 85.000000 4.442651 -0.906340 NaN
22 BTN 0.000000 1.0 NaN NaN 1.0 2.0 0.0 1.0 NaN NaN NaN NaN
23 BWA 0.000000 1.0 7.727273 8.855093 7.0 7.0 0.0 1.0 NaN NaN -2.364460 NaN
24 CAF 0.000000 1.0 NaN 7.192934 1.0 1.0 0.0 1.0 280.000000 5.634789 -3.411248 NaN
25 CAN 99.000000 1.0 9.727273 9.986449 7.0 7.0 9.0 7.0 16.100000 2.778819 -0.060812 1.0
26 CHE 100.000000 0.0 10.000000 10.120211 NaN NaN NaN NaN NaN NaN -0.134675 NaN
27 CHL 50.000000 1.0 7.818182 9.336092 1.0 7.0 5.0 7.0 68.900002 4.232656 -1.335601 1.0
28 CHN 0.000000 0.0 7.772727 7.886081 NaN NaN NaN NaN 118.000000 4.770685 -2.813411 NaN
29 CIV 0.000000 1.0 7.000000 7.444249 1.0 2.0 0.0 1.0 668.000000 6.504288 -2.333044 1.0
... ... ... ... ... ... ... ... ... ... ... ... ... ...
133 STP 100.000000 1.0 8.409091 NaN NaN NaN NaN NaN NaN NaN NaN NaN
134 SUR 1.000000 NaN 4.681818 8.010000 NaN NaN NaN 1.0 32.180000 3.471345 -1.370421 NaN
135 SVK 100.000000 0.0 NaN 8.852236 NaN NaN NaN NaN NaN NaN NaN NaN
136 SVN NaN 0.0 NaN 9.300181 NaN NaN NaN NaN NaN NaN NaN NaN
137 SWE 100.000000 0.0 9.500000 9.866304 NaN NaN NaN NaN NaN NaN -0.239527 NaN
138 SWZ 0.000000 1.0 NaN 8.107720 2.0 1.0 0.0 1.0 NaN NaN -1.807889 NaN
139 SYR 0.000000 1.0 5.818182 8.029433 NaN NaN NaN NaN NaN NaN -0.825536 NaN
140 TCD 0.000000 1.0 NaN 6.835185 1.0 1.0 0.0 1.0 280.000000 5.634789 -3.442019 NaN
141 TGO 0.000000 1.0 6.909091 7.222566 3.0 1.0 0.0 1.0 668.000000 6.504288 -3.218876 1.0
142 THA 0.000000 0.0 7.636364 8.774931 NaN NaN NaN NaN 140.000000 4.941642 -1.851509 NaN
143 TJK NaN 0.0 NaN 6.887553 NaN NaN NaN NaN NaN NaN NaN NaN
144 TKM NaN 0.0 NaN 7.640123 NaN NaN NaN NaN NaN NaN NaN NaN
145 TTO 40.000000 1.0 7.454545 8.768730 7.0 7.0 0.0 1.0 85.000000 4.442651 -0.697155 1.0
146 TUN 3.000000 1.0 6.454545 8.482602 1.0 3.0 0.0 1.0 63.000000 4.143135 -1.527858 1.0
147 TUR 0.000000 0.0 7.454545 8.641179 NaN NaN NaN NaN NaN NaN -1.523260 NaN
148 TWN 0.000000 0.0 9.227273 NaN NaN NaN NaN NaN NaN NaN -0.809681 NaN
149 TZA 0.000000 1.0 6.636364 6.253829 3.0 3.0 0.0 1.0 145.000000 5.634789 -3.442019 1.0
150 UGA 0.000000 1.0 4.454545 6.966024 7.0 3.0 0.0 1.0 280.000000 5.634789 -3.442019 1.0
151 UKR 100.000000 0.0 NaN 7.811974 NaN NaN NaN NaN NaN NaN NaN NaN
152 URY 60.000004 1.0 7.000000 9.031214 1.0 3.0 1.0 1.0 71.000000 4.262680 -1.078810 1.0
153 USA 87.500000 1.0 10.000000 10.215740 7.0 7.0 10.0 7.0 15.000000 2.708050 0.000000 1.0
154 UZB NaN 0.0 NaN 7.807917 NaN NaN NaN NaN NaN NaN NaN NaN
155 VEN 20.000000 1.0 7.136364 9.071078 1.0 3.0 1.0 3.0 78.099998 4.357990 -0.703197 1.0
156 VNM 0.000000 1.0 6.409091 7.279319 1.0 3.0 0.0 1.0 140.000000 4.941642 NaN 1.0
157 YEM 0.000000 1.0 6.363636 6.646390 NaN NaN NaN NaN NaN NaN -1.551169 NaN
158 YUG 100.000000 0.0 6.318182 NaN NaN NaN NaN NaN NaN NaN -1.203973 NaN
159 ZAF 22.000000 1.0 6.863636 8.885994 3.0 7.0 3.0 3.0 15.500000 2.740840 -1.386294 1.0
160 ZAR 8.000000 1.0 3.500000 6.866933 1.0 1.0 0.0 1.0 240.000000 5.480639 -3.411248 1.0
161 ZMB 3.000000 1.0 6.636364 6.813445 3.0 1.0 0.0 1.0 NaN NaN -2.975930 NaN
162 ZWE 7.200000 1.0 6.000000 7.696213 7.0 3.0 0.0 1.0 NaN NaN -2.733368 NaN

163 rows × 13 columns

In [5]:
import pylab
s = pd.Series(Data_descriptive_stats.avexpr)
p = s.plot(kind='hist', color='orange')
pylab.rc("axes", linewidth=1.0)
pylab.rc("lines", markeredgewidth=1.0) 
pylab.xticks(fontsize=10)
pylab.yticks(fontsize=10)
plt.xlabel ('Avg. Protection against Expropriation Risk', fontsize=11)
plt.ylabel ('Frequency', fontsize=11)
plt.title ('Quality of Institutions', fontsize=12)
plt.show()
In [6]:
s = pd.Series(Data_descriptive_stats.logpgp95)
p = s.plot(kind='hist', color='orange')
pylab.rc("axes", linewidth=1.0)
pylab.rc("lines", markeredgewidth=1.0) 
pylab.xticks(fontsize=10)
pylab.yticks(fontsize=10)
plt.xlabel ('Log PPP GDP', fontsize=11)
plt.ylabel ('Frequency', fontsize=11)
plt.title ('GDP Distribution in 1995 (World Bank)', fontsize=12)
plt.show()

OLS Regression

Before applying the instrumental variable approach, log income per capita is first regressed on the quality of institutions, represented by the protection against expropriation variable, in an ordinary least-squares regression. Additionally, we control for several factors such as geographical region and latitude. In mathematical terms, we estimate $$ \log y_{i} = \mu + \alpha R_{i} + X^{'}_{i} \gamma + \epsilon_{i} $$ where $y_i$ is income per capita in country $i$ (its logarithm is logpgp95), $R_i$ is the protection against expropriation measure (avexpr), $X_i$ is a vector of control variables, and $\epsilon_i$ is a random error term.

The results for our coefficient of interest, $ \alpha $, indicate that there is, in fact, a positive and statistically significant correlation between institutional quality and GDP per capita. However, we cannot infer any causal relationship from this result: are rich countries the ones that can afford better institutions, or are good institutions the key to a higher per capita GDP? Moreover, geographical characteristics seem to be related to GDP as well. For instance, the regression results show that GDP per capita is significantly lower for countries located in Africa; the coefficient for Asia is also negative but not statistically significant.
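Because the dependent variable is in logs, the coefficient on avexpr can be read as a semi-elasticity. As a worked example, the snippet below converts the OLS estimate reported in this section's summary table (0.3896) into an approximate percentage difference in GDP per capita for a one-point difference in the expropriation index. The helper name `pct_change` is hypothetical, and the number describes an association, not a causal effect.

```python
import math


def pct_change(coef, delta=1.0):
    """Percentage change in y implied by a change `delta` in x when log(y) is the outcome."""
    return (math.exp(coef * delta) - 1.0) * 100.0


alpha_ols = 0.3896  # coefficient on avexpr from the OLS summary in this notebook
effect = pct_change(alpha_ols)
print(f"One extra point of avexpr goes with about {effect:.1f}% higher GDP per capita")
```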

In [7]:
Ols_data = pd.read_stata ('OLS.dta')
In [8]:
Ols_data
Out[8]:
shortnam africa lat_abst avexpr logpgp95 other asia loghjypl baseco
0 AFG 0.0 0.366667 NaN NaN 0.0 1.0 NaN NaN
1 AGO 1.0 0.136667 5.363636 7.770645 0.0 0.0 -3.411248 1.0
2 ARE 0.0 0.266667 7.181818 9.804219 0.0 1.0 NaN NaN
3 ARG 0.0 0.377778 6.386364 9.133459 0.0 0.0 -0.872274 1.0
4 ARM 0.0 0.444444 NaN 7.682482 0.0 1.0 NaN NaN
5 AUS 0.0 0.300000 9.318182 9.897972 1.0 0.0 -0.170788 1.0
6 AUT 0.0 0.524444 9.727273 9.974877 0.0 0.0 -0.343900 NaN
7 AZE 0.0 0.447778 NaN 7.306531 0.0 1.0 NaN NaN
8 BDI 1.0 0.036667 NaN 6.565265 0.0 0.0 -3.506558 NaN
9 BEL 0.0 0.561111 9.681818 9.992871 0.0 0.0 -0.179127 NaN
10 BEN 1.0 0.103333 NaN 7.090077 0.0 0.0 -2.830218 NaN
11 BFA 1.0 0.144444 4.454545 6.845880 0.0 0.0 -3.540459 1.0
12 BGD 0.0 0.266667 5.136364 6.877296 0.0 1.0 -2.063568 1.0
13 BGR 0.0 0.477778 8.909091 8.457443 0.0 0.0 NaN NaN
14 BHR 0.0 0.288889 8.000000 9.685953 0.0 1.0 NaN NaN
15 BHS 0.0 0.268333 7.500000 9.285448 0.0 0.0 NaN 1.0
16 BIH 0.0 0.488889 NaN NaN 0.0 0.0 NaN NaN
17 BLR 0.0 0.588889 NaN 8.340456 0.0 0.0 NaN NaN
18 BLZ 0.0 0.190556 NaN 8.377932 0.0 0.0 NaN NaN
19 BOL 0.0 0.188889 5.636364 7.926602 0.0 0.0 -1.966113 1.0
20 BRA 0.0 0.111111 7.909091 8.727454 0.0 0.0 -1.142564 1.0
21 BRB 0.0 0.145556 NaN 9.266721 0.0 0.0 -0.906340 NaN
22 BTN 0.0 0.303333 NaN NaN 0.0 0.0 NaN NaN
23 BWA 1.0 0.244444 7.727273 8.855093 0.0 0.0 -2.364460 NaN
24 CAF 1.0 0.077778 NaN 7.192934 0.0 0.0 -3.411248 NaN
25 CAN 0.0 0.666667 9.727273 9.986449 0.0 0.0 -0.060812 1.0
26 CHE 0.0 0.522222 10.000000 10.120211 0.0 0.0 -0.134675 NaN
27 CHL 0.0 0.333333 7.818182 9.336092 0.0 0.0 -1.335601 1.0
28 CHN 0.0 0.388889 7.772727 7.886081 0.0 1.0 -2.813411 NaN
29 CIV 1.0 0.088889 7.000000 7.444249 0.0 0.0 -2.333044 1.0
... ... ... ... ... ... ... ... ... ...
133 STP 0.0 0.011111 8.409091 NaN 0.0 0.0 NaN NaN
134 SUR 0.0 0.044444 4.681818 8.010000 0.0 0.0 -1.370421 NaN
135 SVK 0.0 0.537778 NaN 8.852236 0.0 0.0 NaN NaN
136 SVN 0.0 0.511111 NaN 9.300181 0.0 0.0 NaN NaN
137 SWE 0.0 0.688889 9.500000 9.866304 0.0 0.0 -0.239527 NaN
138 SWZ 1.0 0.292222 NaN 8.107720 0.0 0.0 -1.807889 NaN
139 SYR 0.0 0.388889 5.818182 8.029433 0.0 1.0 -0.825536 NaN
140 TCD 1.0 0.166667 NaN 6.835185 0.0 0.0 -3.442019 NaN
141 TGO 1.0 0.088889 6.909091 7.222566 0.0 0.0 -3.218876 1.0
142 THA 0.0 0.166667 7.636364 8.774931 0.0 1.0 -1.851509 NaN
143 TJK 0.0 0.433333 NaN 6.887553 0.0 1.0 NaN NaN
144 TKM 0.0 0.444444 NaN 7.640123 0.0 1.0 NaN NaN
145 TTO 0.0 0.122222 7.454545 8.768730 0.0 0.0 -0.697155 1.0
146 TUN 1.0 0.377778 6.454545 8.482602 0.0 0.0 -1.527858 1.0
147 TUR 0.0 0.433333 7.454545 8.641179 0.0 1.0 -1.523260 NaN
148 TWN 0.0 NaN 9.227273 NaN 0.0 1.0 -0.809681 NaN
149 TZA 1.0 0.066667 6.636364 6.253829 0.0 0.0 -3.442019 1.0
150 UGA 1.0 0.011111 4.454545 6.966024 0.0 0.0 -3.442019 1.0
151 UKR 0.0 0.544444 NaN 7.811974 0.0 0.0 NaN NaN
152 URY 0.0 0.366667 7.000000 9.031214 0.0 0.0 -1.078810 1.0
153 USA 0.0 0.422222 10.000000 10.215740 0.0 0.0 0.000000 1.0
154 UZB 0.0 0.455556 NaN 7.807917 0.0 1.0 NaN NaN
155 VEN 0.0 0.088889 7.136364 9.071078 0.0 0.0 -0.703197 1.0
156 VNM 0.0 0.177778 6.409091 7.279319 0.0 1.0 NaN 1.0
157 YEM 0.0 0.166667 6.363636 6.646390 0.0 1.0 -1.551169 NaN
158 YUG 0.0 0.488889 6.318182 NaN 0.0 0.0 -1.203973 NaN
159 ZAF 1.0 0.322222 6.863636 8.885994 0.0 0.0 -1.386294 1.0
160 ZAR 1.0 0.000000 3.500000 6.866933 0.0 0.0 -3.411248 1.0
161 ZMB 1.0 0.166667 6.636364 6.813445 0.0 0.0 -2.975930 NaN
162 ZWE 1.0 0.222222 6.000000 7.696213 0.0 0.0 -2.733368 NaN

163 rows × 9 columns

In [9]:
import statsmodels.api as sm
import statsmodels.formula.api as smf
In [10]:
results = smf.ols('logpgp95 ~ avexpr + africa + asia + lat_abst + other', data=Ols_data).fit()
In [11]:
results.summary()
Out[11]:
OLS Regression Results
Dep. Variable: logpgp95 R-squared: 0.715
Model: OLS Adj. R-squared: 0.702
Method: Least Squares F-statistic: 52.74
Date: Tue, 31 Jan 2017 Prob (F-statistic): 4.18e-27
Time: 18:22:16 Log-Likelihood: -102.45
No. Observations: 111 AIC: 216.9
Df Residuals: 105 BIC: 233.2
Df Model: 5
Covariance Type: nonrobust
coef std err t P>|t| [95.0% Conf. Int.]
Intercept 5.8511 0.340 17.230 0.000 5.178 6.524
avexpr 0.3896 0.051 7.691 0.000 0.289 0.490
africa -0.9164 0.166 -5.511 0.000 -1.246 -0.587
asia -0.1531 0.155 -0.989 0.325 -0.460 0.154
lat_abst 0.3326 0.445 0.747 0.457 -0.551 1.216
other 0.3035 0.375 0.810 0.420 -0.440 1.047
Omnibus: 4.342 Durbin-Watson: 1.865
Prob(Omnibus): 0.114 Jarque-Bera (JB): 3.936
Skew: -0.457 Prob(JB): 0.140
Kurtosis: 3.126 Cond. No. 58.2

First-stage

Now, we turn to the analytical method that promises to yield valid insights regarding the research question. In the first stage of the instrumental variable (IV) analysis, the treatment variable is regressed on the instrumental variable. Here, the selected instrument is the mortality rate faced by European settlers in colonial times, since it is easy to measure and arguably exogenous, a necessary condition for isolating the effect of the treatment variable on the outcome variable. In particular, we want to estimate $$ R_i = \zeta + \beta \log M_i + X^{'}_{i} \delta + u_i $$ where $\log M_i$ is the logarithm of European settlers' mortality (logem4) and $u_{i}$ is the random error term. Note that $\log M_i$ is the instrument used to predict the average protection against expropriation risk, which is employed in the second stage to calculate our coefficient of interest, $\alpha$. The regression reported below includes no control variables.

When regressing institutional quality on the log of the colonists' mortality rates between the seventeenth and nineteenth centuries, we obtain a strong first stage, meaning that there is a strong and statistically significant negative correlation between the two variables. In particular, a higher settler mortality rate is associated with lower institutional quality. The pattern is thus consistent with the chain of relationships theorised by Acemoglu, Johnson and Robinson:
1. Potential settler mortality rates were a major determinant of settlements.
2. Settlements were a major determinant of institutions in 1900.
3. Institutions in 1900 are a major determinant of institutions today.
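Instrument relevance is often judged by the first-stage F statistic, and a common rule of thumb treats values above 10 as a sign that the instrument is not weak. With a single regressor, the F statistic equals the squared t statistic, which gives a quick consistency check on the numbers in the first-stage summary of this section. The threshold and the helper name `is_strong_instrument` are assumptions for illustration.

```python
def is_strong_instrument(f_stat, threshold=10.0):
    """Rule-of-thumb check; the threshold of 10 is a convention, not a formal test."""
    return f_stat > threshold


t_logem4 = -5.567   # t statistic on logem4 from the first-stage summary
f_reported = 30.99  # F statistic from the same summary

f_from_t = t_logem4 ** 2  # with one regressor, F equals t squared
print(round(f_from_t, 2), is_strong_instrument(f_from_t))
```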
In [12]:
Data_first_stage = pd.read_stata ('1st-stage.dta')
In [13]:
Data_first_stage 
Out[13]:
lat_abst euro1900 excolony avexpr logpgp95 cons1 indtime democ00a cons00a extmort4 logem4
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
1 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
2 NaN 28.000000 0.0 5.000000 NaN 3.0 154.0 1.0 3.0 78.099998 4.357990
3 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
5 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
6 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
7 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
8 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
9 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
10 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
11 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
12 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
13 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
14 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
15 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
16 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
17 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
18 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
19 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
20 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
21 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
22 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
23 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
24 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
25 NaN NaN NaN 4.431818 NaN NaN NaN NaN NaN NaN NaN
26 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
27 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
28 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
29 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
... ... ... ... ... ... ... ... ... ... ... ...
346 NaN 0.000000 0.0 9.227273 NaN NaN NaN NaN NaN NaN NaN
347 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
348 0.066667 0.000000 1.0 6.636364 6.253829 3.0 33.0 0.0 1.0 145.000000 5.634789
349 0.011111 0.000000 1.0 4.454545 6.966024 7.0 33.0 0.0 1.0 280.000000 5.634789
350 0.544444 100.000000 0.0 NaN 7.811974 NaN NaN NaN NaN NaN NaN
351 0.366667 60.000004 1.0 7.000000 9.031214 1.0 165.0 1.0 1.0 71.000000 4.262680
352 0.422222 87.500000 1.0 10.000000 10.215740 7.0 195.0 10.0 7.0 15.000000 2.708050
353 0.455556 NaN 0.0 NaN 7.807917 NaN NaN NaN NaN NaN NaN
354 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
355 0.146100 10.000000 1.0 NaN 8.318742 NaN NaN NaN NaN 85.000000 NaN
356 0.088889 20.000000 1.0 7.136364 9.071078 1.0 165.0 1.0 3.0 78.099998 4.357990
357 NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN
358 0.177778 0.000000 1.0 6.409091 7.279319 1.0 41.0 0.0 1.0 140.000000 4.941642
359 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
360 NaN NaN 0.0 NaN 8.140316 NaN NaN NaN NaN NaN NaN
361 NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN
362 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
363 NaN NaN 0.0 NaN NaN NaN NaN NaN NaN NaN NaN
364 NaN NaN 0.0 NaN 7.867105 NaN NaN NaN NaN NaN NaN
365 0.166667 0.000000 1.0 6.363636 6.646390 NaN NaN NaN NaN NaN NaN
366 0.488889 100.000000 0.0 6.318182 NaN NaN NaN NaN NaN NaN NaN
367 0.488889 100.000000 0.0 6.318182 NaN NaN NaN NaN NaN NaN NaN
368 0.322222 22.000000 1.0 6.863636 8.885994 3.0 139.0 3.0 3.0 15.500000 2.740840
369 0.000000 8.000000 1.0 3.500000 6.866933 1.0 35.0 0.0 1.0 240.000000 5.480639
370 0.166667 3.000000 1.0 6.636364 6.813445 3.0 31.0 0.0 1.0 NaN NaN
371 0.222222 7.200000 1.0 6.000000 7.696213 7.0 72.0 0.0 1.0 NaN NaN
372 0.222222 7.200000 1.0 NaN NaN 7.0 72.0 0.0 1.0 NaN NaN
373 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
374 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
375 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

376 rows × 11 columns

In [14]:
First_stage = smf.ols('avexpr ~ logem4', data=Data_first_stage).fit()
In [15]:
First_stage.summary()
Out[15]:
OLS Regression Results
Dep. Variable: avexpr R-squared: 0.298
Model: OLS Adj. R-squared: 0.288
Method: Least Squares F-statistic: 30.99
Date: Tue, 31 Jan 2017 Prob (F-statistic): 4.08e-07
Time: 18:22:17 Log-Likelihood: -126.60
No. Observations: 75 AIC: 257.2
Df Residuals: 73 BIC: 261.8
Df Model: 1
Covariance Type: nonrobust
coef std err t P>|t| [95.0% Conf. Int.]
Intercept 9.4925 0.550 17.263 0.000 8.397 10.588
logem4 -0.6441 0.116 -5.567 0.000 -0.875 -0.413
Omnibus: 0.907 Durbin-Watson: 1.806
Prob(Omnibus): 0.635 Jarque-Bera (JB): 0.873
Skew: -0.021 Prob(JB): 0.646
Kurtosis: 2.473 Cond. No. 17.8

Exclusion Restriction

For the instrumental variable regression to be valid, the so-called exclusion restriction has to be fulfilled. This condition requires the instrumental variable to influence the outcome of interest only via the treatment variable. It is plausible that the colonists' expected mortality rates between the seventeenth and nineteenth centuries (our instrument) have no direct effect on present-day economic performance, the outcome variable, in each country. The main concern relates to current diseases: in that case, the instrument's effect on GDP might merely reflect the disease environment rather than the country's quality of institutions. However, it has been argued that the colonists' mortality rates were mainly due to their lack of immunity against diseases such as malaria and yellow fever, an immunity that the indigenous population had developed over the centuries. It is therefore unlikely that such diseases play a central role in causing some former colonies to be extremely poor. Hence,

"The advantage of our approach is that conditional on the variables that we already control for, settler mortality more than 100 years ago should have no effect on output today, other than through its effect on institutions."
Acemoglu et al., 2001

Second-stage

In the second stage of the IV regression analysis, the impact of institutions on income per capita is estimated. Since we could not reproduce the authors' instrumental variable regression command in Python, we conduct the second-stage analysis using an OLS regression as before. In the original paper, Acemoglu et al. also report ordinary least-squares counterparts of the second stage, as shown in Table 5. In particular, we estimate the following equation $$ \log y_{i} = \mu + \alpha \hat{R}_{i} + X^{'}_{i}\gamma + \epsilon_{i} $$ where $\hat{R}_{i}$ is the level of institutional quality predicted by the first stage, $X_i$ is a vector of control variables, and $\epsilon_i$ is a random error term. Note that the regression in the cell below enters the observed avexpr rather than the fitted values, so the reported coefficient should be read as an OLS estimate with additional controls rather than as a two-stage least squares estimate.

The regression results match the ones of the paper and show a highly significant positive correlation between the quality of institutions and the level of income in the investigated countries (0.4761). The coefficient is even larger than the one estimated in the preliminary OLS regression above (0.3896). These findings are in line with the initial expectation that higher-quality institutions go along with better overall economic performance.
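A second stage that uses the first-stage prediction $\hat{R}_i$ can be assembled from two OLS fits, with the controls $X_i$ entering both stages. The sketch below runs on simulated data that borrow the notebook's column names (logem4, avexpr, logpgp95, lat_abst); the simulated coefficients and the names `shock` and `avexpr_hat` are assumptions, and the output is not a result of the paper. Standard errors from a manually run second stage are not valid 2SLS standard errors, because they ignore that `avexpr_hat` is itself estimated.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 2000
shock = rng.normal(size=n)  # unobserved factor that makes avexpr endogenous
df = pd.DataFrame({
    "logem4": rng.normal(4.6, 1.2, size=n),
    "lat_abst": rng.uniform(0.0, 0.7, size=n),
})
df["avexpr"] = 9.5 - 0.6 * df["logem4"] + df["lat_abst"] + shock + rng.normal(size=n)
df["logpgp95"] = (2.0 + 0.9 * df["avexpr"] + 0.5 * df["lat_abst"]
                  - 1.2 * shock + rng.normal(size=n))

# Use one common sample in both stages (matters when real data contain NaN).
df = df.dropna(subset=["logem4", "lat_abst", "avexpr", "logpgp95"])

# Stage 1: institutions on the instrument and the control; keep the fitted values.
first = smf.ols("avexpr ~ logem4 + lat_abst", data=df).fit()
df["avexpr_hat"] = first.fittedvalues

# Stage 2: income on the fitted institutions measure and the same control.
second = smf.ols("logpgp95 ~ avexpr_hat + lat_abst", data=df).fit()

# Plain OLS on the observed regressor, for comparison.
naive = smf.ols("logpgp95 ~ avexpr + lat_abst", data=df).fit()

print("two-stage:", round(second.params["avexpr_hat"], 3))
print("plain OLS:", round(naive.params["avexpr"], 3))
```

Replacing `avexpr` with `avexpr_hat` is the only difference between the two final regressions, which is what separates an instrumental variable estimate from a plain OLS estimate.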
In [16]:
Data_IV_reg = pd.read_stata ('IV-REG-additional-controls.dta')
In [17]:
Data_IV_reg
Out[17]:
shortnam catho80 muslim80 lat_abst no_cpm80 f_brit f_french avexpr sjlofr logpgp95 logem4 baseco
0 AFG 0.000000 99.300003 0.366667 0.699997 1.0 0.0 NaN 1.0 NaN 4.540098 NaN
1 AGO 68.699997 0.000000 0.136667 11.500004 0.0 0.0 5.363636 1.0 7.770645 5.634789 1.0
2 ARE 0.400000 94.900002 0.266667 4.399999 1.0 0.0 7.181818 0.0 9.804219 NaN NaN
3 ARG 91.599998 0.200000 0.377778 5.500001 0.0 0.0 6.386364 1.0 9.133459 4.232656 1.0
4 ARM 0.000000 0.000000 0.444444 100.000000 0.0 0.0 NaN 0.0 7.682482 NaN NaN
5 AUS 29.600000 0.200000 0.300000 46.700001 1.0 0.0 9.318182 0.0 9.897972 2.145931 1.0
6 AUT 88.800003 0.600000 0.524444 4.099997 0.0 0.0 9.727273 0.0 9.974877 NaN NaN
7 AZE 0.000000 93.400002 0.447778 6.599998 0.0 0.0 NaN 0.0 7.306531 NaN NaN
8 BDI 78.300003 0.900000 0.036667 15.899997 0.0 0.0 NaN 1.0 6.565265 5.634789 NaN
9 BEL 90.000000 1.100000 0.561111 8.500000 0.0 0.0 9.681818 1.0 9.992871 NaN NaN
10 BEN 18.500000 15.200000 0.103333 63.500000 0.0 1.0 NaN 1.0 7.090077 5.585449 NaN
11 BFA 9.000000 43.000000 0.144444 46.400002 0.0 1.0 4.454545 1.0 6.845880 5.634789 1.0
12 BGD 0.200000 85.900002 0.266667 13.699999 1.0 0.0 5.136364 0.0 6.877296 4.268438 1.0
13 BGR 0.500000 10.600000 0.477778 88.500000 0.0 0.0 8.909091 0.0 8.457443 NaN NaN
14 BHR 0.800000 95.000000 0.288889 3.300000 1.0 0.0 8.000000 0.0 9.685953 NaN NaN
15 BHS 25.500000 0.000000 0.268333 27.300003 1.0 0.0 7.500000 0.0 9.285448 4.442651 1.0
16 BIH 15.000000 40.000000 0.488889 41.000000 0.0 0.0 NaN 0.0 NaN NaN NaN
17 BLR 14.000000 0.000000 0.588889 86.000000 0.0 0.0 NaN 0.0 8.340456 NaN NaN
18 BLZ 66.800003 0.000000 0.190556 19.999996 1.0 0.0 NaN 0.0 8.377932 5.095589 NaN
19 BOL 92.500000 0.000000 0.188889 5.200000 0.0 0.0 5.636364 1.0 7.926602 4.262680 1.0
20 BRA 87.800003 0.100000 0.111111 8.099997 0.0 0.0 7.909091 1.0 8.727454 4.262680 1.0
21 BRB 5.900000 0.200000 0.145556 60.700001 1.0 0.0 NaN 0.0 9.266721 4.442651 NaN
22 BTN 0.000000 5.000000 0.303333 95.000000 1.0 0.0 NaN 0.0 NaN NaN NaN
23 BWA 9.400000 0.000000 0.244444 63.799999 1.0 0.0 7.727273 0.0 8.855093 NaN NaN
24 CAF 33.099998 3.200000 0.077778 13.700002 0.0 1.0 NaN 1.0 7.192934 5.634789 NaN
25 CAN 46.599998 0.600000 0.666667 23.200001 1.0 0.0 9.727273 0.0 9.986449 2.778819 1.0
26 CHE 52.799999 0.300000 0.522222 3.700000 0.0 0.0 10.000000 0.0 10.120211 NaN NaN
27 CHL 82.099998 0.000000 0.333333 16.000002 0.0 0.0 7.818182 1.0 9.336092 4.232656 1.0
28 CHN 0.000000 2.400000 0.388889 97.599998 0.0 0.0 7.772727 0.0 7.886081 4.770685 NaN
29 CIV 18.500000 24.000000 0.088889 52.799999 0.0 1.0 7.000000 1.0 7.444249 6.504288 1.0
... ... ... ... ... ... ... ... ... ... ... ... ...
133 STP 92.400002 0.000000 0.011111 5.399999 0.0 0.0 8.409091 1.0 NaN NaN NaN
134 SUR 36.000000 13.000000 0.044444 14.400002 0.0 0.0 4.681818 1.0 8.010000 3.471345 NaN
135 SVK 74.000000 0.000000 0.537778 17.600000 0.0 0.0 NaN 0.0 8.852236 NaN NaN
136 SVN 71.400002 1.500000 0.511111 27.099998 0.0 0.0 NaN 0.0 9.300181 NaN NaN
137 SWE 1.400000 0.100000 0.688889 30.099998 0.0 0.0 9.500000 0.0 9.866304 NaN NaN
138 SWZ 10.800000 0.100000 0.292222 55.199997 1.0 0.0 NaN 0.0 8.107720 NaN NaN
139 SYR 1.300000 89.599998 0.388889 8.900002 0.0 1.0 5.818182 1.0 8.029433 NaN NaN
140 TCD 21.000000 44.000000 0.166667 23.400000 0.0 1.0 NaN 1.0 6.835185 5.634789 NaN
141 TGO 29.299999 17.000000 0.088889 47.599998 0.0 1.0 6.909091 1.0 7.222566 6.504288 1.0
142 THA 0.400000 3.900000 0.166667 95.500000 0.0 0.0 7.636364 0.0 8.774931 4.941642 NaN
143 TJK 0.000000 85.000000 0.433333 15.000000 0.0 0.0 NaN 0.0 6.887553 NaN NaN
144 TKM 0.000000 87.000000 0.444444 13.000000 0.0 0.0 NaN 0.0 7.640123 NaN NaN
145 TTO 35.799999 6.500000 0.122222 44.500000 1.0 0.0 7.454545 0.0 8.768730 4.442651 1.0
146 TUN 0.100000 99.400002 0.377778 0.499998 0.0 1.0 6.454545 1.0 8.482602 4.143135 1.0
147 TUR 0.100000 99.199997 0.433333 0.700003 0.0 0.0 7.454545 1.0 8.641179 NaN NaN
148 TWN NaN NaN NaN NaN NaN NaN 9.227273 NaN NaN NaN NaN
149 TZA 28.200001 32.500000 0.066667 28.099998 0.0 0.0 6.636364 0.0 6.253829 5.634789 1.0
150 UGA 49.599998 6.600000 0.011111 41.900002 1.0 0.0 4.454545 0.0 6.966024 5.634789 1.0
151 UKR 0.000000 0.000000 0.544444 100.000000 0.0 0.0 NaN 0.0 7.811974 NaN NaN
152 URY 59.500000 0.000000 0.366667 38.599998 0.0 0.0 7.000000 1.0 9.031214 4.262680 1.0
153 USA 30.000000 0.800000 0.422222 25.600002 1.0 0.0 10.000000 0.0 10.215740 2.708050 1.0
154 UZB 0.000000 88.000000 0.455556 12.000000 0.0 0.0 NaN 0.0 7.807917 NaN NaN
155 VEN 94.800003 0.000000 0.088889 4.199997 0.0 0.0 7.136364 1.0 9.071078 4.357990 1.0
156 VNM 3.900000 1.000000 0.177778 94.900002 0.0 1.0 6.409091 1.0 7.279319 4.941642 1.0
157 YEM 0.000000 99.500000 0.166667 0.400000 0.0 0.0 6.363636 1.0 6.646390 NaN NaN
158 YUG 4.000000 19.000000 0.488889 76.000000 0.0 0.0 6.318182 0.0 NaN NaN NaN
159 ZAF 10.400000 1.300000 0.322222 49.299999 1.0 0.0 6.863636 0.0 8.885994 2.740840 1.0
160 ZAR 48.400002 1.400000 0.000000 21.199999 0.0 0.0 3.500000 1.0 6.866933 5.480639 1.0
161 ZMB 26.200001 0.300000 0.166667 41.599998 1.0 0.0 6.636364 0.0 6.813445 NaN NaN
162 ZWE 14.400000 0.900000 0.222222 63.299999 1.0 0.0 6.000000 0.0 7.696213 NaN NaN

163 rows × 12 columns

In [18]:
Second_stage = smf.ols('logpgp95 ~ avexpr + lat_abst + f_brit + f_french + sjlofr', data=Data_IV_reg).fit()
In [19]:
Second_stage.summary()
Out[19]:
OLS Regression Results
Dep. Variable: logpgp95 R-squared: 0.670
Model: OLS Adj. R-squared: 0.654
Method: Least Squares F-statistic: 42.55
Date: Tue, 31 Jan 2017 Prob (F-statistic): 9.33e-24
Time: 18:22:22 Log-Likelihood: -110.70
No. Observations: 111 AIC: 233.4
Df Residuals: 105 BIC: 249.7
Df Model: 5
Covariance Type: nonrobust
coef std err t P>|t| [95.0% Conf. Int.]
Intercept 4.2941 0.404 10.628 0.000 3.493 5.095
avexpr 0.4761 0.056 8.498 0.000 0.365 0.587
lat_abst 1.2293 0.487 2.525 0.013 0.264 2.195
f_brit 0.3231 0.179 1.808 0.073 -0.031 0.677
f_french -0.3784 0.202 -1.870 0.064 -0.780 0.023
sjlofr 0.6409 0.176 3.650 0.000 0.293 0.989
Omnibus: 8.072 Durbin-Watson: 1.685
Prob(Omnibus): 0.018 Jarque-Bera (JB): 7.916
Skew: -0.643 Prob(JB): 0.0191
Kurtosis: 3.239 Cond. No. 59.3

Conclusion

In the presented work, we reproduced the findings of Acemoglu et al. concerning the relationship between institutions and economic performance in former European colonies. In line with the methodology of the original paper, we followed the two-stage instrumental variable approach in order to isolate exogenous sources of variation in institutions and to assess their impact on income per capita. This procedure is crucial for establishing a causal relationship between the two variables of interest, which, in turn, supports valid inference. The results we obtained in the regression analyses indicate a large and statistically significant effect of institutional structure on economic performance measured by GDP per capita. Naturally, there are many questions that the presented paper does not cover and which could be studied in further research. However, the given findings offer valuable insights into the determinants of a country's economic efficiency and therefore wealth.


Tables