Employing Data Mining Techniques in Testing the Effectiveness of Modernization Theory

This interdisciplinary study tests the effectiveness of Modernization Theory in explaining regime change by means of data mining techniques. Modernization Theory, which links democratization with economic development (improvements in income, urbanization, industrialization, education and communication levels), has been widely criticized. Many critics have posited that there is no significant relation between economic development and democratization. This study is an attempt to test whether the theory has improved its effectiveness with the advent of the Internet and mobile phone technologies. To this end, first, the variables are introduced. Then, the study carries out an analysis using data mining techniques. It first tests the correlation between democratization and improvements in income, education, urbanization and communication levels for the period between 1976 and 1995. Then it adds the new variables, Internet and mobile phone usage, and tests the correlation between democratization and this new range of variables for the 1996-2015 period. In the conclusion, the study evaluates whether the effectiveness of Modernization Theory is improved when Internet and mobile phone usage are added as new variables. It is found that there is not a strong relation between income per capita and democratization, as some critics of Modernization Theory suggest, but that other factors emphasized by the theory, such as improvements in education and communication, have a more decisive effect. Moreover, among our new variables, Internet usage proved to be a particularly important variable conducive to democratization according to the test results.


Introduction
The self-immolation of Mohammed Bouazizi sent shock waves through the entire Middle East and beyond. The incident is considered to have sparked the events later called the 'Arab Spring'. Bouazizi was a street vendor in Tunisia who set himself on fire on 17 December 2010 to protest the confiscation of his wares by municipal officials and his mistreatment by the police and the municipality (Lageman 2016). He became a symbol for the masses in the region, who were fed up with corruption, unemployment and mistreatment. The demonstrations in Tunisia led to the resignation of then-president Zine El Abidine Ben Ali (Ryan 2011) and soon spread to other countries in the region.
Since they began, the Arab Spring events have changed the lives of millions of people in the region in various ways. Moreover, they altered international balances and led the powers with interests in the region to reformulate their policies. They affected academia as well. Academics had a hard time explaining the events, which were quite unexpected for them. The start of the Arab Spring was especially puzzling for those relying on Modernization Theory to account for political change. Whereas Modernization Theory links political development with economic development, what triggered the events in Tunisia and eventually led to democratization in the country was economic hardship rather than economic development.
As a result, Modernization Theory has increasingly become the target of criticism, as will be discussed in the following section. However, one has to bear in mind that Modernization Theory not only links political development with economic development but also directs attention to the relation between democratization and improved levels of education, urbanization and communication. Therefore, if one takes into account that social media and mobile phones played a critical role in drawing people to the streets to protest, Modernization Theory appears to be an approach whose real strength has not yet been recognized.
In this study, Modernization Theory will be examined from a broad and experimental perspective and its effectiveness will be tested thoroughly. To this end, first the main premises of the theory will be discussed and the main variables that the theory uses to explain political change will be addressed. Second, the main criticisms of the theory and the context in which these criticisms emerged will be examined. Third, the study carries out an empirical analysis using data mining techniques. It first tests the correlation between democratization and improvements in income, education, urbanization and communication levels for the period between 1976 and 1995. Then it adds the new variables, Internet and mobile phone usage, and tests the correlation between democratization and this new range of variables for the 1996-2015 period. In the conclusion, the study evaluates whether the effectiveness of Modernization Theory is improved when Internet and mobile phone usage are added as new variables.

Modernization Theory and Its Discontents
Modernization can be defined as a process through which economic and technological change leads to the transformation of the institutions and values of a society (Augustinos 1991, 2). It is a process through which less developed societies acquire attributes common to more developed societies (Lerner 1968, 386). The theory linking this economic and technological change to democratization is called Modernization Theory. Lipset and Lerner, basing their claims on the studies of Herbert Spencer, Karl Marx, Max Weber, Emile Durkheim and Talcott Parsons, pioneered the studies focusing on this link (Kennedy 2010, 785 and Schmidt 2010, 513).
Lipset's seminal article 'Some Social Requisites of Democracy: Economic Development and Political Legitimacy' is a good starting point for a discussion of the premises of Modernization Theory. In this article, Lipset argues that there is a link between economic development and democracy in the sense that "the more well-to-do a nation, the greater the chances that it will sustain democracy" (Lipset 1959, 75). In his understanding, economic development comprises wealth, education, urbanization and industrialization. It is necessary to state that by wealth, he does not mean only per capita income in a country. He also includes radios, telephones and newspapers per person in his criteria for economic development. Besides wealth, he focuses on industrialization, urbanization and education (Lipset 1959, 75). As indices of industrialization, he uses the percentage of males in agriculture and per capita energy consumed. For education, his variables are the percentage of the population that is literate, primary education enrollment per 1,000 persons and higher education enrollment per 1,000 persons, and his indices for urbanization are the percentage of the population in metropolitan areas, in cities over 20,000 and in cities over 100,000 (Lipset 1959, 76, 77).
It is necessary to state that in his seminal article, Lipset was largely inspired by Lerner. One year before Lipset's article, Lerner introduced urbanization, education and communication (media) as essential factors in the process of individual modernization and political participation (Wucherpfennig and Deutsch 2009, 2). It was Lipset who carried out an empirical study focusing on these indices and found that whereas the economically developed countries of Western Europe, together with the United States and Canada, have democratic systems, the less developed countries of Latin and Eastern Europe, Latin America and the then newly independent Asia and Africa lack such systems (Lipset 1959). He also discussed his thesis more comprehensively in the book he wrote one year later, Political Man: The Social Bases of Politics.
Lipset also argues that a large income gap is a hurdle for democracy. He states that when the gap is huge, the upper classes tend to treat the lower classes as inferior. Under these conditions, they do not regard giving the lower classes political rights as necessary; such an action seems absurd to them (Lipset 1959, 83-84). He also argues that increased wealth changes the social conditions of the working class. When they have increased income, greater economic security and higher education, workers are inclined to develop longer time perspectives and gradualist views of politics rather than extremist ones (Lipset 1959, 83). He emphasizes the role of the middle class in mediating the conflict between the upper and lower classes. He does not carry out an empirical study to test the relation between the class structure of a society and democracy, but it is clear that income distribution is a significant factor for him in evaluating the chances for democracy.
It is necessary to emphasize that Lipset does not argue that economic wealth brings about democratization automatically. He focuses on the changes in society brought about by increased wealth: improvements in education, income distribution, urbanization and communication. In his thesis, it is through these channels that democracy makes inroads into authoritarian countries. As he argues, these changes make a society more likely to embrace political tolerance and selection based on competence and performance, without favoritism (Lipset 1959, 84). Nor does he think that democracy cannot exist without increasing wealth. He argues that there is no need for pessimism when the conditions enjoyed by the democratic countries of the West are lacking in other countries. When these conditions are lacking, the actions of people can still shape institutions and the trajectory of events in directions that increase or decrease the chances of democracy developing and surviving (Lipset 1959, 103). Therefore, it can be argued that rather than ruling out other mechanisms for the development and survival of democracy and focusing exclusively on structural factors, Lipset even gave a nod to actor-oriented (procedural) approaches to regime change, which would put emphasis on the role of elites in democratization.
Lipset's thesis that there is a link between economic development (and the changes it creates in society) and democracy would later become the target of broad criticism. Modernization Theory was highly popular in the 1950s and 1960s because of its thesis on developing countries, and it declined in popularity in the 1970s and 1980s as a result of the criticisms directed at it (Martinelli 2004, 1). At the end of the 1980s and in the 1990s, it went through a revival thanks to several factors. First of all, the collapse of the Soviet Union freed Modernization Theory from the challenge of a competing theory. In addition to the former Soviet republics, former Eastern bloc members in Europe started to follow the trajectories advised by modernization theorists. China's rapid development at the end of the 1980s and in the 1990s was also regarded as, and called, modernization both inside and outside the country. Lastly, young scholars in this era began to defend the theory against criticisms with new energy and came up with new conceptual extensions. As a result, Modernization Theory enjoyed a revival from the end of the 1980s to the mid-1990s (Marsh 2014, 266, 267).
The well-known criticism of the theory by Przeworski and Limongi (1997) proved instrumental in bringing this revival to an end. In an attempt to evaluate the theory's degree of success in linking democratization to economic development, they distinguish between endogenous and exogenous democracy (Przeworski and Limongi 1997, 157). The endogenous version holds that economic development increases the chances that a country will experience a transition to democracy. The exogenous version holds that, once democracy is established, economic development increases its chances of survival. After carrying out an empirical study, Przeworski and Limongi found that the evidence did not substantiate the endogenous thesis: the relation between economic development and transition to democracy is insignificant. They argue that democracy is or is not established by political actors pursuing their aims, at any level of economic development (Przeworski and Limongi 1997, 177). By contrast, they point out that their findings strongly confirm the exogenous version of Lipset's theory. Once democracy is established, its chances of survival are greater when the country is more affluent (Przeworski and Limongi 1997, 166, 177).
Although the criticisms of Przeworski and Limongi had an important impact on studies of regime change, a close examination reveals that their study suffers from important weaknesses. First, they conclude that the endogenous thesis has negligible explanatory capacity after testing only the relationship between per capita income and democracy. In his seminal article, Lipset makes a more comprehensive analysis by including indices of improvements in education, urbanization and industrialization. It is unfair to arrive at such a conclusion by focusing on only one variable. In this study, we make a broad analysis by including various indices for education, communication, urbanization and industrialization besides gross national income per capita. Another weakness of their study arises from the fact that they accuse Modernization Theory of being deterministic (Przeworski and Limongi 1997, 176). As the forerunner of this theory, Lipset does not deserve such a criticism, because he argues only that, as far as the data he had are concerned, there seems to be a correlation between economic development and democratization. He also states that actors can play critical roles in the trajectories of countries, as they can shape rules and institutions. Ryan Kennedy (2010) recently offered a good critique of Modernization Theory by arguing that whereas economic crises can bring about the end of dictators, economic development during their rule increases their legitimacy in the eyes of the people they rule and serves to prolong their rule. Therefore, he argues that the relationship between economic development and democratization seems to work in the opposite direction to what Modernization Theory claims (Kennedy 2010, 786).

Empirical Study and Findings
This section describes what we did to test the relationship between economic development (together with the improvements it brings in education, urbanization, industrialization and communication) and democracy. We sought a mathematical relation between the democracy scores of countries and the possible predictors of those scores. Some predictors, such as "Internet users per 100 people" and "Mobile cellular subscriptions per 100 people", had few values for the 1976-1995 period. Therefore, we divided the time span into two periods, 1976-1995 and 1996-2015. Keeping all the other predictors the same, we employed two additional predictors, "Internet users per 100 people" and "Mobile cellular subscriptions per 100 people", for the 1996-2015 period to assess the relation between the democracy scores of the countries and the predictors. The common predictors of democracy scores for both periods begin with "Literacy rate, adult total (% of people ages 15 and above)"; the remaining common predictors are listed at the end of the paper.
Our two data sets (1976-1995 period, 1996-2015 period) were compiled from World Bank Data Bank and Freedom House resources. The Freedom House resource (Freedom House 2016a) was used to obtain the democracy scores of 172 countries worldwide. The World Bank Data Bank (World Bank 2016) was used to obtain the predictor values of the countries.
According to Freedom House, countries are labeled "Free", "Partly Free" or "Not Free" on the basis of their "Political Rights" and "Civil Liberties" scores. Political Rights and Civil Liberties are each measured on a one-to-seven scale, with one representing the highest degree of freedom and seven the lowest. Until 2003, countries whose combined average ratings for Political Rights and Civil Liberties fell between 1.0 and 2.5 were designated "Free", between 3.0 and 5.5 "Partly Free", and between 5.5 and 7.0 "Not Free". Beginning with the ratings for 2003, countries whose combined average ratings fall between 3.0 and 5.0 are labeled "Partly Free", and those between 5.5 and 7.0 "Not Free". In our study, we decided to employ regression rather than classification. Therefore, combined average ratings ("(Political Rights + Civil Liberties)/2") were used rather than democracy status values ("Free", "Partly Free", "Not Free") (Freedom House 2016b). Regression allows us to track small changes in the predicted attribute, whereas classification would collapse the combined average ratings into three categories.
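The difference between the two targets can be illustrated with a minimal sketch. The function names below are hypothetical and not part of any Freedom House tooling; the status bands are the 2003-onward ones quoted above, and the two example countries are invented.

```python
def combined_average_rating(political_rights: int, civil_liberties: int) -> float:
    """Average of the two ratings, each on a 1 (most free) to 7 (least free) scale."""
    for score in (political_rights, civil_liberties):
        if not 1 <= score <= 7:
            raise ValueError("ratings must lie between 1 and 7")
    return (political_rights + civil_liberties) / 2


def freedom_status(rating: float) -> str:
    """Map a combined average rating to a status label (2003-onward bands)."""
    if rating <= 2.5:
        return "Free"
    if rating <= 5.0:
        return "Partly Free"
    return "Not Free"


# Two hypothetical countries whose ratings differ by a full point
# still receive the same label, so a classifier would not see the change.
a = combined_average_rating(3, 3)
b = combined_average_rating(4, 4)
print(a, freedom_status(a))
print(b, freedom_status(b))
```

Both hypothetical countries are labeled "Partly Free" even though their combined ratings differ, which is the kind of small change a regression target preserves.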
After constructing the two datasets, we chose the Weka data mining tool (Weka 3) and Microsoft Office Excel to conduct the data analysis. Every data mining process includes a data preprocessing phase, and this phase includes the selection of significant attributes. Accordingly, a supervised attribute filter was used in Weka to select the significant attributes (predictors). This filter is very flexible and allows various search and evaluation methods to be combined. Among the parameters it uses, "Evaluator" and "Search" are the most important. "Evaluator" determines how attributes or attribute subsets are evaluated. "Search" determines the search method. In our study, "CfsSubsetEval" and "BestFirst" were selected as the evaluator and search methods, respectively. CfsSubsetEval evaluates the worth of a subset of attributes by considering the individual predictive ability of each feature along with the degree of redundancy between them. BestFirst searches the space of attribute subsets by greedy hill climbing augmented with a backtracking facility.
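The idea behind correlation-based subset evaluation can be sketched in a few lines. This is not Weka's implementation: it assumes plain Pearson correlation as the association measure, uses the standard CFS merit k * r_cf / sqrt(k + k(k-1) * r_ff), and replaces BestFirst with a simple greedy forward search without backtracking. All names and the toy data are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no information
    return cov / sqrt(vx * vy)


def cfs_merit(columns, target, subset):
    """Merit rises with attribute-target correlation, falls with redundancy."""
    k = len(subset)
    if k == 0:
        return 0.0
    r_cf = sum(abs(pearson(columns[i], target)) for i in subset) / k
    pairs = [(i, j) for i in subset for j in subset if i < j]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


def greedy_forward_selection(columns, target):
    """Add the attribute that most improves merit; stop when none does."""
    selected, best = [], 0.0
    while True:
        candidates = [i for i in range(len(columns)) if i not in selected]
        scored = [(cfs_merit(columns, target, selected + [i]), i) for i in candidates]
        if not scored:
            break
        merit, index = max(scored)
        if merit <= best + 1e-12:
            break
        selected.append(index)
        best = merit
    return selected, best


# Toy data: column 0 tracks the target exactly, column 1 is a noisy
# near-duplicate of column 0, column 2 is uncorrelated with the target.
target = [1, 2, 3, 4, 5, 6]
columns = [
    [2, 4, 6, 8, 10, 12],
    [1, 2, 3, 4, 6, 5],
    [1, -1, 0, 0, -1, 1],
]
selected, merit = greedy_forward_selection(columns, target)
print(selected, round(merit, 3))
```

In this toy case the redundant and the uninformative columns both lower the merit, so only the first column is kept. The same trade-off explains how a large predictor set can shrink to a handful of attributes.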
The preprocessing phase reduced the number of predictors from 14 to 2 for the 1976-1995 period and from 16 to 8 for the 1996-2015 period. The remaining, and therefore most significant, attributes for the 1976-1995 period are "School enrollment, primary (% gross)" and "Fixed telephone subscriptions (per 100 people)". For the 1996-2015 period, we notice that the number of Internet users and mobile cellular subscriptions are among the most significant predictors of democracy scores.
In the second phase of data analysis, we applied multiple linear regression in Microsoft Office Excel for both periods. Figure 1 shows the regression statistics for the 1996-2015 period. The t-test identifies "Population in urban agglomerations of more than 1 million (% of total population)" and "Internet users per 100 people" as the only statistically significant predictors of the democracy score, since their p-values are smaller than 0.05. The p-value is defined as the probability of obtaining a result at least as extreme as the one actually observed when the null hypothesis is true. The threshold value, also called the significance level of the test, was set at the conventional 5%. The coefficient of "Internet users per 100 people" is -0.035 in the regression equation. This shows that, keeping all the other factors constant, a one-unit increase in this predictor decreases the democracy score by 0.035. This is a favorable result, since lower democracy scores indicate a more democratic regime. That is, Internet usage is associated with a more democratic regime.
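How coefficients and p-values of this kind arise can be sketched with ordinary least squares in NumPy. This is an illustration, not the Excel workflow used in the study: the data are synthetic, the variable names are hypothetical, and the two-sided p-values use a large-sample normal approximation to the t distribution rather than the exact t-test. The slope of -0.035 is planted in the synthetic data only to echo the coefficient reported above.

```python
import math
import numpy as np


def ols_summary(X, y):
    """Least squares with an intercept; returns coefficients, t-statistics, p-values."""
    X = np.column_stack([np.ones(len(y)), np.asarray(X, dtype=float)])
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    residuals = y - X @ beta
    sigma2 = residuals @ residuals / (n - p)      # residual variance
    std_err = np.sqrt(sigma2 * np.diag(xtx_inv))  # standard error per coefficient
    t_stats = beta / std_err
    # Two-sided p-value, normal approximation to the t distribution.
    p_values = np.array([math.erfc(abs(t) / math.sqrt(2)) for t in t_stats])
    return beta, t_stats, p_values


# Synthetic data (not the study's data): the score falls by 0.035 per unit
# of the first predictor and does not depend on the second predictor.
rng = np.random.default_rng(0)
n = 200
internet_users = rng.uniform(0, 100, n)
unrelated_predictor = rng.uniform(0, 100, n)
score = 5.0 - 0.035 * internet_users + rng.normal(0, 0.5, n)

predictors = np.column_stack([internet_users, unrelated_predictor])
beta, t_stats, p_values = ols_summary(predictors, score)
names = ["intercept", "internet_users", "unrelated_predictor"]
for name, b, p in zip(names, beta, p_values):
    print(f"{name:20s} coef={b:+.4f}  p={p:.3g}  significant={p < 0.05}")
```

A negative coefficient with a p-value below 0.05 is read exactly as in the paragraph above: holding the other predictor fixed, each additional unit of the predictor lowers the predicted score by the magnitude of the coefficient.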

Figure 1. Multiple Regression Statistics for 1996-2015 Period
Although the Weka and Excel results present different significant predictors, the Internet usage attribute appears in both experiments. One may question the high p-value of the "Mobile cellular subscriptions per 100 people" predictor in the multiple regression statistics. We think that this is mostly due to the high positive correlation between the "Mobile cellular subscriptions per 100 people" and "Internet users per 100 people" predictors, which is shown in Figure 2. The correlation coefficient (Multiple R) between these two attributes is 0.79. This high correlation may mask the contribution of the "Mobile cellular subscriptions per 100 people" predictor in the regression equation. Although it is not statistically significant, it has a negative coefficient of -0.008. This shows that, keeping all the other factors constant, a one-unit increase in this predictor decreases the democracy score by 0.008. That is, mobile cellular subscription is associated with a more democratic regime.
Figure 2. Simple Regression Statistics (Dependent Attribute: Internet users per 100 people)
Figure 3 shows the regression statistics for the 1976-1995 period. The t-test identifies "Population in urban agglomerations of more than 1 million (% of total population)", "Fixed telephone subscriptions (per 100 people)", "Employment in industry (% of total employment)" and "Energy use (kg of oil equivalent per capita)" as the only statistically significant predictors of the democracy score, since their p-values are smaller than 0.05. Although the Weka and Excel results present different significant predictors, the "Fixed telephone subscriptions (per 100 people)" attribute appears in both experiments. The coefficient of this predictor is -0.074 in the regression equation. This shows that, keeping all the other factors constant, a one-unit increase in this predictor decreases the democracy score by 0.074. The same predictor has a coefficient of -0.018 for the 1996-2015 period. This suggests that communication-related attributes (mobile or fixed) are associated with more democratic scores, regardless of their p-values.
To summarize, the usage of mobile and fixed telephones and of Internet technologies has a positive effect on democratization. By contrast, GNI per capita is not statistically significant in the observed regression equations. Even if it were, the GNI per capita predictor has a coefficient of nearly zero in the regression equations of both periods. That is, GNI per capita has little relation to the democracy scores of countries. Therefore, with our empirical study, we have shown that although Przeworski and Limongi are right in their argument that there is a negligible relation between income level and democratization, the other variables of economic development have an important relation to democratization. They were right on this point, but their study was limited in scope and for that reason inadequate. In the light of the findings of this study, their criticism of Modernization Theory and Lipset seems to be unfair.

Conclusion
This study has focused on the effectiveness of Modernization Theory in explaining the relation between economic development and democratization. To this end, it first examined the main premises of the theory put forward by Lipset. It was shown that besides improvements in income per capita, Lipset emphasized the importance of variables including improvements in education, urbanization, industrialization and communication. Then the criticisms of the theory were examined, and it was argued that whereas Lipset focused on a wide range of variables to account for the relation between economic development and democratization, his critics, among whom Przeworski and Limongi were the most prominent, focused only on income per capita. Then, the study carried out a more comprehensive empirical analysis to test the relation between economic development and democratization in an appropriate way. We focused on GNI per capita, literacy rate, primary, secondary and tertiary school enrollment rates, income distribution, percentage of population in urban agglomerations of more than 1 million, percentage of urban population, employment in industry and energy use as variables for the period 1976-1995. For the period 1996-2015, we added two new variables: Internet users per 100 people and mobile cellular subscriptions per 100 people.
Our test results revealed that the usage of mobile and fixed telephones and of Internet technologies has a positive effect on democratization. By contrast, GNI per capita proved not to be statistically significant in the observed regression equations. Even if it were, the GNI per capita predictor has a coefficient of nearly zero in the regression equations of both periods. That is, GNI per capita has little relation to the democracy scores of countries. Therefore, with our empirical study, we have shown that although Przeworski and Limongi are right in their argument that there is a negligible relation between income level and democratization, the other variables of economic development have an important relation to democratization. They were right on this point, but their study was limited in scope and for that reason inadequate. As a result, their criticism of Modernization Theory and Lipset seems to be unjust.
Acemoglu et al. (2007) also provided a widely known critique of Modernization Theory. These writers argued that the positive relationship between economic development and democracy is an illusion. Countries become democratic or authoritarian due to critical junctures in history (Acemoglu et al. 2007). Once country-specific variables are included in the analysis of the trajectories of countries, it is seen that critical historical junctures are the real cause of both economic development and democratization (Acemoglu et al. 2007).
The common predictors of democracy scores for both periods, in addition to literacy rate, are:
School enrollment, tertiary (% gross)
School enrollment, secondary (% gross)
School enrollment, primary (% gross)
Population in urban agglomerations of more than 1 million (% of total population)
Urban population (% of total)
Fixed telephone subscriptions (per 100 people)
Income share held by highest 20%
Income share held by second 20%
Income share held by third 20%
Income share held by fourth 20%
Income share held by lowest 20%
GNI per capita
Employment in industry (% of total employment)
Energy use (kg of oil equivalent per capita)

The most significant attributes for the 1976-1995 period are:
School enrollment, primary (% gross)
Fixed telephone subscriptions (per 100 people)

The most significant attributes for the 1996-2015 period are:
Literacy rate, adult total (% of people ages 15 and above)
School enrollment, tertiary (% gross)
School enrollment, secondary (% gross)
Fixed telephone subscriptions (per 100 people)
Income share held by third 20%
Income share held by fourth 20%
Internet users per 100 people
Mobile cellular subscriptions per 100 people