1 Introduction

Modeling the economic consequences of natural hazard-induced disasters has gained importance in recent years. In the Mediterranean region, earthquakes are one of the most critical natural hazards, and they can cause considerable economic and social losses. Coupled with the vulnerability of buildings and high social and economic exposure, earthquakes are a regular and serious threat to communities, particularly in Greece, Italy, Turkey, Morocco, and Algeria.

Between 1900 and 2015, the Mediterranean region experienced thousands of damaging earthquakes, some of which were major disasters, such as the earthquakes of Messina, Italy, in 1908 (Barbano et al. 2005), Chios-Cesme, Greece, in 1949 (Altinok et al. 2005), Agadir, Morocco, in 1960 (El Alami et al. 2004), El-Asnam, Algeria, in 1980 (Bertero and Shah 1983), Izmit, Turkey, in 1999 (Barka 1999), and more recently, Boumerdes, Algeria, in 2003 (Laouami et al. 2006). These six damaging earthquakes resulted in over 150,000 deaths and caused direct losses of more than USD 26.5 billion.

Magnitude and distance are the two key parameters that control the impact of an earthquake in an urbanized area. From 1900 to 2015, even moderate earthquakes (for example, the Agadir, Morocco, M 5.7 earthquake in 1960) caused considerable economic loss and numerous fatalities, due to the vulnerability of traditional building stock. Although the number of potentially damaging earthquakes per year has neither increased nor decreased in recent decades, the vulnerability and exposure of communities have changed. According to the World Health Organization (WHO 2016), the urban population will continue to grow by approximately 1.5–1.6% per year from 2015 to 2030. As a result, about 2.8 million casualties due to earthquakes can be expected worldwide by 2100 (Holzer and Savage 2013). Because of the long return periods of the largest and most consequential earthquakes, and because few urban areas have yet suffered such major events in their current configuration, Jackson (2006) noted that the greatest earthquake disasters appear to lie in the future.

The benefits of natural hazard mitigation programs have been fully recognized (Benson and Twigg 2004; Whitehead and Rose 2009), and earthquake loss models can provide relevant information for decision makers and policymakers (Guéguen et al. 2016). Moreover, the benefits of rapid and effective economic and human loss estimation modeling can be of utmost importance in the immediate aftermath of earthquake disasters (Erdik et al. 2011; Jaiswal and Wald 2013). Developing relevant earthquake loss models is challenging, because most economic and human loss models are based on post-earthquake interpretations, and the empirical relationships are derived from field observations that include substantial uncertainties (Brookshire et al. 1997). Before developing our earthquake loss models, the key questions that had to be addressed were: (1) how to represent the losses from the wealth of the region via a macroeconomic indicator; (2) how to express the economic losses by a homogeneous monetary value over the period of the earthquakes; (3) how to express the magnitude of the seismic hazard; and (4) which functional form of the loss model needed to be used?

There are various models in the literature for rapid assessment of earthquake losses based on economic and human predictors, such as gross domestic product (GDP) and population (Chen et al. 1997; Cha 1998; Chen et al. 2001; Dunbar et al. 2002; Heatwole and Rose 2013). Most of the models are reduced-form models, which consider only one or two parameters. Using a theoretical model, Schumacher and Strobl (2011) showed that economic loss (per capita) due to disasters increases with the size and density of the population in the affected area, and that GDP per capita and its squared value can show statistical significance for loss models. Chen et al. (1997) proposed a model for losses as a function of the occurrence probability of the intensity and a vulnerability function derived empirically by correlating reported losses from past earthquakes with the GDP of the affected area. Using only 29 earthquakes, Cha (1998) developed a log–log relationship between economic losses and GDP for each intensity. Heatwole and Rose (2013) proposed a specific reduced-form model based on U.S. earthquakes that considered a linear regression model of predictor variables to predict losses. Their study examined the population of the affected area, the magnitude of the earthquake, and the total economic losses adjusted to 2011 values using the Consumer Price Index. Jaiswal et al. (2011) published a worldwide model to predict economic impacts for any earthquake. They examined earthquakes that occurred between 1980 and 2007 in 119 countries through a combination of seismic intensity, spatial distribution of the population, and total GDP, as scaled using an exposure correction factor (the ratio of wealth per capita to GDP per capita). The wealth estimate per capita was adjusted to the year 2000 value using values derived from the World Bank website (see Footnote 1).

Based on the observation that economic and social losses are generally related to physical damage caused to buildings (D’Ayala et al. 1997; Bommer et al. 2002; Coburn and Spence 2002; So and Spence 2013), models require a detailed inventory of buildings. Several such models have been developed to forecast socioeconomic losses; for example, the U.S. Federal Emergency Management Agency (FEMA) methodology (HAZUS) for the estimation of potential losses from disasters (FEMA 2016), and the Norwegian Seismic Array (NORSAR) methodology of seismic loss estimation (SELENA) that uses a logic tree approach (Molina et al. 2010). For these models, economic losses and casualties were computed based on building vulnerability and damage for a given seismic hazard. The requirement for a detailed building inventory can be a major drawback for extending loss models to specific areas because of the sheer number of different buildings and the complications involved in accurately assessing their seismic resistance (Guéguen 2013). But recent studies have shown the efficiency of data mining-based vulnerability assessment methods on a large scale that use elementary building characteristics, such as the number of floors and the roof shape. These structural features can easily be assessed by remote sensing or are contained in national databases (Riedel et al. 2014, 2015).

There are relatively few studies on human casualties in the literature. Samardjieva and Oike (1992), Samardjieva and Badal (2002), and Badal et al. (2005) analyzed and tested a model to estimate the (log value of the) number of casualties as a function of the magnitude and population density for earthquakes worldwide. Nichols and Beavers (2003) established a simple equation between fatalities and magnitude for U.S. earthquakes. Jaiswal and Wald (2010) developed a specific empirical model for estimation of earthquake fatalities based on the shaking intensity and the number of people exposed to earthquakes recorded between 1973 and 2007. Post-earthquake observations consistently show that there is a strong correlation between fatalities and damage to buildings. After the Armenian earthquake in 1988, Armenian et al. (1997) reported that being inside a building at the time of the earthquake was the most efficient predictor for fatality, which was related to the damage grade. Coburn and Spence (2002) developed an empirical fatality model that considers the damage grade of the European Macroseismic Scale (EMS98) as a predictor variable (that is, D4 for extensive damage and D5 for total collapse).

A consensus emerges with regard to the importance of building damage as a contributing factor for losses and the need to consider regional models based on regional earthquake data. Since it appears that losses are directly related to the amount of damage, construction quality will be a key factor, which requires limiting the model to a specific region wherein the structural design is comparable. Moreover, in our dataset, many moderate earthquakes—for example, the 1960 Agadir earthquake—caused significant economic and human losses associated with severe building damage. Consequently, earthquake characteristics—for example, magnitude, intensity—alone are not sufficient to derive earthquake loss models. Moreover, to help earthquake crisis management and the prediction of the recovery period, the number of people who lost their homes is another key piece of information.

The main objective of this study is to consider damage as a hazard-related parameter and to show improvement in the accuracy of loss estimation obtained in empirical models by using this parameter instead of intensity and/or magnitude alone as hazard-related parameters. We developed empirical models for economic losses and human losses (home loss, injuries, fatalities) from earthquakes as applied to the Mediterranean region through the compilation of data from different databases and countries. Particular attention has been paid to Algeria—a high exposure, seismic-prone region—to show the benefit of reducing uncertainties by considering regional events for loss models. Several models and exposure variables were also tested (population, GDP). In the following section, we present the data on seismological features (magnitude, location) and economic, social, and physical consequences. When the regression models are developed, the results and their interpretations are discussed. Finally, these models are applied to a case study of Constantine, Algeria using data from a large-scale vulnerability assessment that was carried out previously using data mining-based methods.

2 Mediterranean Earthquakes and Losses Data

To develop a seismic loss model for a specific region with any degree of reliability, historical data of earthquake impact must be carefully considered and compiled in a comprehensive way. This involves the review of existing earthquake catalogs, engineering damage reports, and databases of losses reported after earthquakes, among other sources. The objective of the present study was not to offer a comparative analysis of data quality, and the selection of the data sources here was based on their international use and dissemination, or considered as being authoritative at the national scale. For this study, hundreds of reports and publications were consulted, but only a few reports that described losses in detail were ultimately used.

The region considered is the Mediterranean region (Fig. 1), which includes several earthquake-prone countries and is exposed to both strong (for example, Italy, Algeria, Greece) and moderate (for example, France, Morocco, Libya) seismic hazard. The data are drawn from the period of 1900–2015, and are for earthquakes for which detailed loss reports were available. Following a scientific approach, only accessible, public documents were considered, and these are referenced to allow the traceability of the information used.

Fig. 1 The 65 earthquakes (circles) in the Mediterranean region considered in this study, during the period 1900–2015

The characteristics of the earthquakes are given in Tables 1 and 2. Magnitude and intensity (Modified Mercalli Intensity, MMI) are from the U.S. Geological Survey earthquake search catalog (see Footnote 2) for earthquakes after 1980, and from public reports or peer-reviewed articles for others. As an example, Benouar (1994) completed a database of the strongest Algerian earthquakes for the 1900–1980 period, the Algerian Technical Office (Azzouz 2002; Azzouz and Rebzani 2005) provided information on specific earthquakes, and on behalf of the French ministry in charge of natural disaster management, Payany (1983) provided information on the damaging earthquake that occurred in France in 1909. The final dataset consisted of 65 earthquakes that occurred between 1908 and 2014, with maximal epicentral intensities ≥ VI (Table 1), and located in nine countries: Algeria (24), Italy (12), Greece (11), Turkey (11), Morocco (2), Spain (2), France, Libya, and Egypt (1 each). Figure 2a shows the distribution of the earthquakes per country. This study does not include all the damaging earthquakes in the Mediterranean region over this period, mainly because, for some events, either no data on economic and human losses were available or the only data came from highly uncertain sources.

Table 1 Mediterranean earthquakes from 1900 to 2015 compiled in the database, with the Algerian earthquakes listed separately at the end
Table 2 Population and damage for the Mediterranean for the 1900–2015 earthquakes compiled in the database, with the Algerian earthquakes listed separately at the end
Fig. 2 Distribution of the earthquakes included in the present study: a by country in the Mediterranean region; b–d by associated socioeconomic losses used and data source, for economic losses (b), home loss (c), and number of damaged buildings (d)

The Earthquake Engineering Research Institute (EERI) reports (EERI 1986, 1992, 1995, 1998, 1999, 2004, 2012) provided many post-earthquake descriptions that were collected in the field soon after the events, and these were used to complete the seismological information in our study. In terms of damage and economic and human losses, the EERI reports are often limited to information available immediately after the event. Several months, or even years, can pass after an earthquake before this information can be considered comprehensive, and the missing information on total socioeconomic losses is never updated in these reports. Additional official databases were therefore examined to gather the missing information for this study: the International Disaster Database EM-DAT (Guha-Sapir et al. 2016), the Damaging Earthquakes Database CATDAT (Daniell et al. 2011; CATDAT 2016), and the Significant Earthquake Database of the National Geophysical Data Center (NGDC 2016). These databases give no information on the reliability and accuracy of the data that they contain (for instance, the exposed populations, economic losses, and fatalities), as they consist only of compilations of reports.

The EM-DAT database was created in 1988 and is maintained by the Centre for Research on the Epidemiology of Disasters in Brussels, Belgium. This database contains comprehensive data related to the occurrence and effects of over 18,000 natural disasters throughout the world, from 1900 until the present day. The EM-DAT data were cross-checked with other sources of information, such as UN agencies, nongovernmental organizations, and reinsurance companies. The NGDC database, which was developed by the National Centers for Environmental Information, is similar to EM-DAT. It contains data on over 5700 destructive earthquakes from 2150 B.C. to the present, including information related to socioeconomic losses. CATDAT was created in 2010, and it contains over 20,000 sources of information that provide data on losses from over 12,000 historical damaging earthquakes (Daniell et al. 2011). In the present study, we considered information from CATDAT for earthquakes that occurred between 2011 and 2014. Finally, additional sources of information, such as public and official reports and peer-reviewed articles, completed the loss databases used in this study (Table 2). Figure 2b–d shows the distribution of the earthquakes considered in the present study for each data source, in terms of economic losses (Fig. 2b), home loss (Fig. 2c), and number of damaged buildings (Fig. 2d).

The earthquakes were mainly located in the most seismically active countries of the Mediterranean region (Italy, Greece, Turkey, and Algeria), which represent 90% of the dataset. As particular attention was given to the Algerian region in the present study, 37% of the selected earthquakes were in Algeria. This proportion is not representative of the seismicity rate over this period, but is due to the information that was available to the authors. As shown in Fig. 2, more than 60% of the data for socioeconomic losses was drawn from the EM-DAT (Guha-Sapir et al. 2016) and NGDC (2016) databases. Because our objective was to construct the most comprehensive and traceable dataset possible, the data sources were cross-referenced according to their international audience or dissemination and validation at the international level, and references for the preferred source of information for earthquake occurrence were included in Tables 1 and 2.

3 Method

Based on data from the Mediterranean earthquake and loss database developed for this study, and using regression analysis between economic and human losses and hazard and exposure factors, empirical loss models for earthquakes in the Mediterranean region are developed in this study and their accuracy is examined. These models are then applied in a case study of Algeria for earthquake loss prediction. As a first step to develop empirical seismic loss models from the predictor variables available in the study database, the regression model forms proposed by Cha (1998) for GDP and seismic intensities, Heatwole and Rose (2013) for magnitude, population, and GDP, and Samardjieva and Badal (2002) for population and magnitude are adopted in this study.

Economic losses (L) are considered as a key dependent variable of the regression models. Daniell et al. (2010) reported that economic losses not adjusted to current USD values are a significant difficulty when compiling earthquake data from different epochs. To compare the consequences of earthquakes in a comprehensive manner, the actual economic losses were adjusted and normalized to a common and unique value (Daniell et al. 2011), namely 2015 USD. Nevertheless, for many countries around the Mediterranean (for example, France, Algeria, and Italy), this value was only available from the national census agencies after 1950, and GDP data before 1950 were provided by Maddison (2006). Inflation was allowed for by using the Consumer Price Index inflation calculator from the Bureau of Labor Statistics, U.S. Department of Labor (see Footnote 3) to update the economic loss value from the year of the earthquake (L EQ) to the 2015 value (L $2015). For the Algerian earthquakes and the single French earthquake of Lambesc (1909), L EQ was given in currencies that no longer exist. Here, the French Institute of Statistics and Economic Studies converter (see Footnote 4) was used to convert L EQ into Euros on the date of the earthquake, which was then adjusted to L $2015 (Table 2).
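The inflation adjustment described above can be illustrated with a short sketch. It assumes that the adjustment amounts to scaling the nominal loss by the ratio of the 2015 price index to the index of the event year, which is how Consumer Price Index calculators generally work. The function name and the index values below are hypothetical and are not taken from the study.

```python
def adjust_to_2015(loss_eq, cpi_event_year, cpi_2015):
    """Scale a nominal loss (L EQ) to 2015 USD (L $2015) by the ratio of price indices.

    Assumption: L_2015 = L_EQ * CPI_2015 / CPI_event_year.
    """
    if cpi_event_year <= 0 or cpi_2015 <= 0:
        raise ValueError("CPI values must be positive")
    return loss_eq * cpi_2015 / cpi_event_year


# Hypothetical index values, chosen only to make the arithmetic easy to follow.
loss_2015 = adjust_to_2015(loss_eq=100.0, cpi_event_year=50.0, cpi_2015=200.0)
print(loss_2015)
```

Losses reported in a currency other than USD would first need a currency conversion step, which is not shown here.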

Home loss (H), injury (I), and fatality (F) are used as dependent variables of the regression models for human (social) losses, and their statistics came directly from the EM-DAT and NGDC databases for most of the earthquakes considered in the present study, or from referenced reports (Table 1).

Earthquake intensity and magnitude and building damage are selected as hazard variables. Damage (D) is usually reported as the total number of buildings that suffer damage. The present study determined the total number of buildings based on those having suffered heavy to strong damage, as classified by the damage grades D4 and D5 (Table 2), according to the EMS98 damage scale.

Population affected by the earthquakes (POP.Unit) is considered an important exposure variable, and these data came either from direct reports or other references (Table 2):

$$POP.Unit = POP_{Tot} \frac{A_{Aff}}{A_{Tot}}$$
(1)

where A Aff is the size of the affected area, A Tot is the total surface area of the country, and POP Tot is the population of the country on the date of the earthquake. POP Tot was provided by the Population City database (see Footnote 5), taking into account the population growth rate. A Aff was obtained either from references or through estimation from seismic intensity maps.

GDP per capita on the date of the earthquake (GDP PC ) is another key exposure variable. The GDP of the affected area (GDP.Unit) is estimated by multiplying GDP PC by POP.Unit, as follows (the values are presented in Table 2):

$$GDP.Unit = \frac{GDP}{POP_{Tot}} \times POP.Unit = GDP_{PC} \times POP.Unit$$
(2)

The GDP values adjusted to the USD 2015 values were obtained from the World Bank website (see Footnote 6). For Algeria before 1960, this information was provided by Clerc (1975).
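As a worked illustration of Eqs. 1 and 2, the following sketch computes POP.Unit and GDP.Unit for a single event. The function names and all input numbers are hypothetical; only the two formulas come from the text.

```python
def pop_unit(pop_tot, a_aff, a_tot):
    """Eq. 1: population of the affected area, scaled by the area ratio."""
    if a_tot <= 0:
        raise ValueError("total area must be positive")
    return pop_tot * a_aff / a_tot


def gdp_unit(gdp, pop_tot, pop_unit_value):
    """Eq. 2: GDP of the affected area = GDP per capita * POP.Unit."""
    if pop_tot <= 0:
        raise ValueError("total population must be positive")
    gdp_pc = gdp / pop_tot
    return gdp_pc * pop_unit_value


# Hypothetical country: 40 million inhabitants, 2,000,000 km2,
# GDP of 200,000 million USD, and an affected area of 5,000 km2.
affected_pop = pop_unit(pop_tot=40_000_000, a_aff=5_000, a_tot=2_000_000)
affected_gdp = gdp_unit(gdp=200_000, pop_tot=40_000_000, pop_unit_value=affected_pop)
print(affected_pop, affected_gdp)  # affected_gdp is in million USD
```

Note that Eq. 1 implicitly assumes a uniform population density over the country, so the sketch inherits that simplification.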

Because of the lack of information, losses directly caused by an earthquake were not distinguished from those due to secondary events, such as the tsunami and/or fires that accompanied the Messina earthquake in 1908. Indeed, Bird and Bommer (2004) reported a small contribution of secondary events to the total losses (except for mega-earthquakes, such as Indonesia 2004 and Japan 2011, which were not in the present study).

4 Results

Empirical relationships between loss variables and hazard and exposure factors are presented in this section. The accuracy of the relationships derived from the database information is estimated using the standard error (RMSE), the coefficient of determination (R 2), and the adjusted R 2.
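The three fit measures named above can be computed as in the following sketch. It assumes the common definitions: RMSE as the square root of the mean squared residual, R 2 as one minus the ratio of residual to total sum of squares, and adjusted R 2 corrected for the number of predictors. The study may use a slightly different normalization for the standard error (for example, dividing by the degrees of freedom), so this is an illustration rather than the exact procedure used.

```python
import math


def fit_metrics(observed, predicted, n_predictors):
    """Return (rmse, r2, adjusted_r2) for a fitted regression model."""
    n = len(observed)
    if n != len(predicted) or n - n_predictors - 1 <= 0:
        raise ValueError("need matching lengths and n > n_predictors + 1")
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)  # assumption: divide by n, not by n - p - 1
    r2 = 1.0 - ss_res / ss_tot
    adjusted_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)
    return rmse, r2, adjusted_r2


# Hypothetical observed and predicted log-losses for five events.
obs = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.1, 1.9, 3.2, 3.8, 5.0]
rmse, r2, adj_r2 = fit_metrics(obs, pred, n_predictors=1)
print(rmse, r2, adj_r2)
```

The adjusted value penalizes additional predictors, which matters when comparing one-predictor and two-predictor models fitted to a few dozen events.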

4.1 Economic Losses (L $2015) Versus GDP and Intensity

The linear regression model proposed by Cha (1998) that considers the GDP of the country on the date of the earthquake, is as follows:

$$\log_{10} L_{EQ} = a(MMI) + b(MMI) \cdot \log_{10}(GDP) + \varepsilon$$
(3)

where log10 indicates the decimal logarithmic function, a(MMI) and b(MMI) are the regression coefficients that depend on seismic intensity, ε is the RMSE, and L EQ and GDP are given in million USD.
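A minimal sketch of how Eq. 3 could be fitted is given below: events are grouped by intensity, and an ordinary least-squares line is fitted to log10 loss versus log10 GDP within each group. The function names and the records are hypothetical, and the synthetic losses were constructed so that the fitted coefficients are easy to check. They are not data from this study.

```python
import math
from collections import defaultdict


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    if n < 2:
        raise ValueError("need at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0.0:
        raise ValueError("predictor values must not all be equal")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def fit_by_intensity(records):
    """Fit a(MMI) and b(MMI) of Eq. 3 separately for each intensity class.

    records: iterable of (mmi, gdp_million_usd, loss_million_usd).
    """
    groups = defaultdict(list)
    for mmi, gdp, loss in records:
        groups[mmi].append((math.log10(gdp), math.log10(loss)))
    return {
        mmi: fit_line([p[0] for p in pts], [p[1] for p in pts])
        for mmi, pts in groups.items()
        if len(pts) >= 2
    }


# Synthetic records: VIII follows a = -1, b = 1; IX follows a = 0, b = 0.5.
records = [
    ("VIII", 10.0, 1.0), ("VIII", 100.0, 10.0), ("VIII", 1000.0, 100.0),
    ("IX", 100.0, 10.0), ("IX", 10000.0, 100.0),
]
models = fit_by_intensity(records)
print(models)
```

Fitting each intensity class separately keeps the model simple, but it also means that classes with few events produce poorly constrained coefficients.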

In the Cha (1998) study, 29 earthquakes worldwide over the 1980–1995 period, of seismic intensities VIII, IX, and X were considered when deriving a loss empirical model. Figure 3 shows the log10 values of the losses in USD adjusted to the 2015 value (L $2015) versus the GDP.Unit predictor variable of the selected Mediterranean earthquakes and according to seismic intensities given in Table 1 and exposure-related variable in Table 2. Only 54 observations with VI < MMI < XI and with L $2015 and GDP.Unit information available are considered. Compared with Cha (1998), the linear regression models are different for seismic intensities (MMI) of VIII to X, which confirms the differences between the worldwide and regional models, and the need to derive a loss-prediction model for any specific area using the appropriate data (Heatwole and Rose 2013). For moderate seismic-prone regions, loss-prediction models for weak seismic intensity (VII) are also relevant, as shown by the moderate earthquake-induced losses referenced in the present dataset (Tables 1, 2).

Fig. 3 Economic damage adjusted to 2015 USD value (L $2015) versus GDP.Unit with the regression models for seismic intensities VII–X of the Mediterranean earthquakes examined in this study, compared with the Cha (1998) empirical model results

Table 3 summarizes the results of the regression models for intensities VII–X using Eq. 3. Considering R 2, the model fits are relatively good, at 0.62–0.76, and are as good as those of the models provided by Cha (1998). Regression for losses was also performed considering the exposed population as a predictor variable and using the same regression model form as Eq. 3. Table 4 gives the results of the regression models. As expected and reported by most previous studies, POP.Unit is not as relevant as GDP.Unit for the economic loss models. In this case, the R 2 values of the models fall around 0.5, which is notably lower than the values obtained by considering GDP.Unit.

Table 3 Regression of the economic losses adjusted to 2015 USD value (L $2015) considering GDP.Unit as the exposure-related variable for seismic intensities between VII and X (Eq. 3)
Table 4 Regression of the economic losses adjusted to 2015 USD value (L $2015) considering POP.Unit as the exposure-related variable for seismic intensities between VII and X

4.2 Economic Losses (L $2015) Versus Earthquake Magnitude, GDP, and Affected Population

Heatwole and Rose (2013) used the magnitude of U.S. earthquakes as a hazard-related predictor variable for loss prediction, and considered the population of the affected area and GDP as exposure-related variables. Assuming a non-normal distribution of the variables, they considered a natural-log model using a conventional form, as follows:

$$\log \left( {L_{\$ 2015} } \right) = a\log \left( {X1} \right) + b\log \left( {X2} \right) + c + \varepsilon ,$$
(4)

where log indicates the natural log; a, b, and c are the regression coefficients; \(\varepsilon\) is the RMSE; X1 is the hazard-related variable (magnitude or intensity); and X2 is the exposure-related variable (GDP.Unit or POP.Unit). The sparse nature of the data and the minimum number of predictor variables mean that these variables are not normally distributed and are therefore considered as logged variables.
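The two-predictor form of Eq. 4 can be fitted by ordinary least squares on the logged variables, as sketched below using the normal equations. The helper names, the coefficients, and the event values are hypothetical. The synthetic losses are generated from known coefficients so that the fit can be verified, and they do not represent results from this study.

```python
import math


def solve_linear_system(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * solution[k] for k in range(row + 1, n))
        solution[row] = (aug[row][n] - tail) / aug[row][row]
    return solution


def fit_log_model(x1, x2, loss):
    """Return (a, b, c) of Eq. 4: log(L) = a*log(X1) + b*log(X2) + c."""
    rows = [[math.log(u), math.log(v), 1.0] for u, v in zip(x1, x2)]
    y = [math.log(value) for value in loss]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    a, b, c = solve_linear_system(xtx, xty)
    return a, b, c


def predict_loss(a, b, c, x1, x2):
    """Back-transform the logged prediction to the original loss units."""
    return math.exp(a * math.log(x1) + b * math.log(x2) + c)


# Synthetic events generated from hypothetical coefficients a=2.0, b=0.8, c=-3.0.
magnitudes = [5.5, 6.0, 6.5, 7.0, 6.2]
gdp_units = [100.0, 2000.0, 500.0, 10000.0, 50.0]
losses = [predict_loss(2.0, 0.8, -3.0, m, g) for m, g in zip(magnitudes, gdp_units)]
a_fit, b_fit, c_fit = fit_log_model(magnitudes, gdp_units, losses)
print(a_fit, b_fit, c_fit)
```

For real data, a numerical library with a QR-based or SVD-based least-squares solver would be more robust than the normal equations used here. The same function applies unchanged when intensity replaces magnitude as X1 or POP.Unit replaces GDP.Unit as X2.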

Table 5 summarizes the regression model coefficients following the original presentation by Heatwole and Rose (2013) using POP.Unit and GDP.Unit to predict the L $2015 losses. Only 57 observations with VI < MMI < XI and with L $2015 and GDP.Unit and POP.Unit information available are considered. As Heatwole and Rose (2013) reported, the regression fit coefficient (adjusted R 2) is higher for GDP.Unit (0.653) than for POP.Unit (0.527), because GDP.Unit is a macroeconomic variable that represents the wealth of the affected region, whereas the population is not. Compared with using the regression forms of Cha (1998) (Tables 3, 4), the same results are observed, which confirms the relevance of GDP rather than population for prediction of economic damage. For GDP.Unit, the coefficient of the regression fit (adjusted R 2) is of the same order of magnitude (0.610 versus 0.653) as for the regional U.S. earthquake-based model (Heatwole and Rose 2013), which also confirms the advantage of using regional earthquake loss databases. However, the magnitude characterizes the actual earthquake and does not take into account any seismic wave attenuation with distance, whereas the macroseismic intensity does. It is therefore reasonable to assume that intensity is a more relevant hazard-related predictor than magnitude. Table 6 gives the economic loss predictions with intensity as the hazard-related predictor variable, in the same manner as the formulation by Cha (1998). The fit model parameters (R 2, adjusted R 2) are higher with intensity than with magnitude, with values comparable to those from Eq. 3 considering GDP.Unit (Table 3), and R 2 close to 0.7. This result shows that intensity is a better hazard-related predictor variable for relevant loss assessment.

Table 5 Regression of the economic losses adjusted to 2015 USD value (L $2015) considering magnitude (M) as the hazard-related variable and GDP.Unit or POP.Unit as the exposure-related variable, and the coefficients of the regression models (Eq. 4)
Table 6 Regression of the economic losses adjusted to 2015 USD value (L $2015) considering intensity (MMI) as the hazard-related variable and GDP.Unit or POP.Unit as the exposure-related variable, and the coefficients of the regression models (Eq. 4)

4.3 Human Losses Considering Building Damage

Extensive studies have been carried out using building damage as a predictor variable for loss assessment (D’Ayala et al. 1997; Bommer et al. 2002; Goretti and Di Pasquale 2002; Kappos et al. 2006; So and Spence 2013; FEMA 2016), on the premise that damage to structures produces direct and indirect economic losses and fatalities. Damage can indeed be considered as a proxy of intensity, as macroseismic surveys completed after earthquakes are mainly derived from the observed damage for the strongest shaking (> VII). The amount of damage also affects the costs related to reconstruction and business interruption because of loss of use of damaged buildings.

In the present study, extensive data allowed the development and testing of regression models, with the loss estimation based on reduced forms, like those proposed by Heatwole and Rose (2013), but considered on a log10 scale. The human loss variables are the number of homeless and injured people, and the fatalities, while considering the affected population (POP.Unit) as an exposure-related variable. The hazard-related predictor variables are seismic intensity and number of damaged buildings classified as D4 + D5, according to EMS98. In our dataset, the quality of the data is relatively heterogeneous: some references give the number of casualties to the nearest unit, while others give rounded quantities, which introduces uncertainty into the regression models. The primary objective of the present study was also to develop prediction models for the Mediterranean region, with special focus on Algeria, and here the Algerian earthquakes are defined separately from the total dataset in each analysis.

Figure 4 shows as an example the distribution of human losses versus damage (Fig. 4a) and intensity (Fig. 4b) for all of the earthquakes. Exposed population is not considered here.

Fig. 4 Human losses (home loss, injuries, fatalities) versus damage (a) and intensity (b) as hazard-related predictor variables. The black lines correspond to the best-fit regression models

Table 7 summarizes all of the analysis results related to the regression fits, with RMSE and adjusted R 2. Table 7 and Fig. 4 show that for injuries (I) and fatalities (F), the models show low fits compared to those for home loss (H). Considering the whole dataset of earthquakes and intensities as the predictor variables (Fig. 4b, Table 7), low-fit models were derived, with adjusted R 2 of 0.23, 0.34, and 0.42 for the H, I, and F human loss variables, respectively. These values improved significantly to 0.72, 0.49, and 0.44, respectively, when damage was used as the predictor variable (Fig. 4a, Table 7). These models show the lowest RMSE values, which confirms the relevance of damage as a predictor of casualties. Furthermore, a reduced-form model that considered intensity and damage simultaneously was tested (Table 7), and the fits did not improve significantly (0.73, 0.53, and 0.54, respectively). Despite the significant relevance of damage as a hazard-related variable, the fits remained low, which resulted in imprecise assessment when using these models. However, when considering the Algerian earthquakes only (Table 7), the fits are considerably improved, reaching 0.98, 0.72, and 0.57, respectively, although this is based on a small amount of data and is not sufficiently representative statistically. This does suggest that regional data-based models that integrate urban features (regional typology, building design, and quality of construction) can improve loss prediction models that consider building damage, which is not the case for the reduced-form regression model that considers intensity and the Algerian earthquakes only (Table 7; with adjusted R 2 of 0.43, 0.28, and 0.37, respectively).

Table 7 Regression of the home loss (H), injuries (I), and fatalities (F) variables as human loss parameters using damage (D4 + D5), intensity (MMI), and population affected (POP.Unit) as the predictor variables for the earthquakes included in the Mediterranean database and for the Algerian earthquakes

The relevance of building damage as a predictor variable is also confirmed for economic losses. Table 8 gives the economic losses adjusted to 2015 USD value (L $2015) considering GDP.Unit and damage as the exposure-related and hazard-related variables of the reduced-form model. Compared with the fits given in Table 6, which consider intensity as the hazard-related variable, the adjusted R 2 increases to 0.72 (from 0.70) and the RMSE decreases a little to 0.55 (compared with 0.56).

Table 8 Regression of the economic loss adjusted to 2015 USD value (L $2015) considering GDP.Unit and damage (D4 + D5) as the exposure-related and hazard-related variables, respectively, for all of the earthquakes for which this information is available (56 from Tables 1, 2)

Because human losses depend on the exposed population, as suggested by Samardjieva and Badal (2002) for a worldwide model, a Mediterranean model that considers the exposed population (POP.Unit) and damage was also derived. Table 7 presents the results of the regression, which improve on the fits given previously in Table 7: adjusted R² reached 0.77, 0.58, and 0.48 for H, I, and F, respectively. These values improved further when only the Algerian earthquakes were considered (Table 7), with values of 0.98, 0.75, and 0.67, respectively, which is also the case when damage is used instead of intensity (Table 7).

This analysis confirms the relevance of damaged buildings in models of economic and human losses, as well as the need to consider the exposed population for human losses. Combining damage as the hazard-related variable with GDP or exposed population as the exposure-related variable improves the accuracy of the prediction. Nevertheless, damage is more challenging to predict in real time than intensity or magnitude, especially for regions with no available building inventory or large-scale vulnerability models. New, rapid methods for large-scale damage estimation are becoming more readily available. These methods are based on statistical analysis or data mining of urban data derived from satellite remote sensing (Saito and Spence 2011) or national census databases (Pittore and Wieland 2013; Riedel et al. 2014, 2015; Guettiche et al. 2017).

5 The Constantine (Algeria) Case

In this section, the loss models developed in this study are applied to the city of Constantine, Algeria. We considered Constantine a suitable setting for testing the empirical relationships derived in the previous section, without aiming at a full seismic-risk analysis that includes a site-specific hazard assessment. Constantine is the third largest city in Algeria in terms of population and economic activities. It is located in Algeria’s most seismically active region (acceleration, 0.129 g in the Algerian Seismic Hazard Map), and it has suffered several earthquakes in the past (Benouar 1994), including the strongest felt historical earthquake, of intensity VIII, in 1985. Guettiche et al. (2017) performed a seismic vulnerability analysis of this area based on methods developed by Riedel et al. (2014) and Riedel et al. (2015), which were validated by comparison with classical in situ engineering survey results. The elementary information provided by remote sensing and the national census (for example, period of construction, number of stories) was used to assess seismic vulnerability. We used the EMS98 damage matrix for intensities VII–X to compute seismic damage (D4 + D5) for each intensity. Our aim is not a loss assessment of Constantine for a realistic earthquake scenario, but a case study that provides insights into the uncertainties of the loss model related to the database. Only the damage-related scenarios given in Guettiche et al. (2017) were considered for testing the loss models for intensities VII–X.
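As a rough illustration of how a damage matrix yields a D4 + D5 count for one intensity level, the sketch below sums, over vulnerability classes, the number of buildings multiplied by the probability of reaching damage grade 4 or 5. The inventory and the probabilities are hypothetical placeholders, not values from the EMS98 damage matrix or from Guettiche et al. (2017), and the function name is an assumption.

```python
from typing import Dict, Tuple

# Hypothetical inventory: number of buildings per vulnerability class.
inventory: Dict[str, int] = {"A": 200, "B": 1000, "C": 500}

# Hypothetical (P(D4), P(D5)) per class for ONE intensity level.
# Placeholder values for illustration only.
p_d4_d5: Dict[str, Tuple[float, float]] = {
    "A": (0.20, 0.05),
    "B": (0.08, 0.01),
    "C": (0.02, 0.00),
}

def expected_heavy_damage(
    inventory: Dict[str, int],
    p_d4_d5: Dict[str, Tuple[float, float]],
) -> float:
    """Expected number of buildings in damage grades D4 or D5."""
    total = 0.0
    for vuln_class, n_buildings in inventory.items():
        p4, p5 = p_d4_d5[vuln_class]
        total += n_buildings * (p4 + p5)
    return total

d4_plus_d5 = expected_heavy_damage(inventory, p_d4_d5)
print(f"Expected D4 + D5 buildings: {d4_plus_d5:.0f}")
```

Repeating the calculation with the probabilities for each intensity from VII to X would give one D4 + D5 value per intensity, which is the hazard-related input the loss models require.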

The population of Constantine in 2008 was around 1 million, according to the National Office of Statistics (footnote 7). The area considered in the present study is highly populated and is characterized by dense neighborhoods composed of old residential buildings. The urban area affected by the scenario earthquake defined by Guettiche et al. (2017) has an exposed population (POP.Unit) of 100,000 inhabitants. The GDP per capita of Algeria is USD 4152.77 (footnote 8), and for the area considered, GDP.Unit can be defined as follows:

$$\text{GDP.Unit} = \text{POP.Unit} \times \text{USD } 4152.77 = \text{USD } 415{,}277{,}000$$
(5)
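The arithmetic in Eq. 5 can be checked with a few lines of code. The sketch uses the exposed population and GDP per capita stated in the text; the function and constant names are illustrative assumptions, and `Decimal` is used only to avoid floating-point rounding in the currency figure.

```python
from decimal import Decimal

POP_UNIT = 100_000                        # exposed population, from the text
GDP_PER_CAPITA_USD = Decimal("4152.77")   # GDP per capita of Algeria, from the text

def gdp_unit(pop_unit: int, gdp_per_capita: Decimal) -> Decimal:
    """Exposed GDP: exposed population times GDP per capita (Eq. 5)."""
    return pop_unit * gdp_per_capita

gdp_unit_usd = gdp_unit(POP_UNIT, GDP_PER_CAPITA_USD)
print(f"GDP.Unit = USD {gdp_unit_usd:,.0f}")
```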

The reduced-form models defined in Table 6 for the economic losses and Table 7 for the human losses (home loss, injuries, fatalities) were applied to the area considered for Constantine. Table 9 summarizes the results for seismic intensities VII–X. It provides information related to building damage and economic and/or human impact, which are also key elements in performance-based earthquake-engineering approaches. In Constantine, the total number of damaged buildings ranged from 195 (intensity VII) to 1654 (intensity X), which produced mean loss indicators for home loss, injured persons (injuries), and fatalities of 6067, 157, and 91, respectively, and economic losses of about USD 200 million.

Table 9 Home loss, injuries, fatalities, and economic losses (L $2015) computed in Constantine for macroseismic intensities VII–X using D4 + D5 as a hazard-related variable and POP.Unit or GDP.Unit as exposure-related variable, respectively

The uncertainty here is relatively high because of the uncertainties in the information contained in the database. However, decision makers are not directly concerned with magnitude, building damage, or intensity. They pay attention instead to performance indices for losses and injuries to occupants, or to the number of homeless people who need to be sheltered in the event of a seismic disaster. Compared with previous studies, the integration of building damage as a hazard-related parameter improves the prediction, and should thus be incorporated into the analysis of earthquake losses. Furthermore, data collection after earthquakes should be improved.
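To give a sense of what a regression error of this size can mean in practice, the sketch below converts an RMSE into a multiplicative band around a central estimate. It assumes that the loss models were fitted on the base-10 logarithm of the loss, which this section does not state, so the band is illustrative only. The inputs are the mean economic loss of about USD 200 million and the RMSE of 0.55 quoted above; the function name is hypothetical.

```python
def log10_band(central: float, rmse_log10: float) -> tuple:
    """Return (low, high) = central * 10**(-rmse), central * 10**(+rmse).

    Assumes the regression error is expressed in log10 units of the loss.
    """
    if central <= 0:
        raise ValueError("central estimate must be positive")
    factor = 10.0 ** rmse_log10
    return central / factor, central * factor

# Illustrative inputs taken from the text; the log10 interpretation is an assumption.
low, high = log10_band(200e6, 0.55)
print(f"One-RMSE band: USD {low / 1e6:.0f} million to USD {high / 1e6:.0f} million")
```

Under that assumption, a one-RMSE band spans more than an order of magnitude between its lower and upper bounds, which is consistent with the caution expressed above.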

6 Conclusions

This study provided a compilation of data and empirical models to estimate the economic and human impacts of earthquakes in the Mediterranean region. Earthquakes are complex and rare events, and post-earthquake information has to be carefully scrutinized before regression models can be developed. There is nevertheless a need to define performance metrics that are relevant for decision makers and seismic risk mitigation. Generally speaking, these metrics relate to the risks of casualties for anticipation of the short-term responses to an emergency, such as home loss and personal injuries, and anticipation of economic losses and fatalities for the long-term recovery processes. In this study, data related to 65 earthquakes in the Mediterranean region were compiled in a comprehensive database, with particular attention paid to Algeria. Reduced-form models were then derived, based on hazard-related and exposure-related variables. Because building damage is a key indicator related to losses, we have demonstrated that the integration of such information into regional models improves the loss assessment by increasing the adjusted R² and reducing the regression error (RMSE). Damage information requires extensive analysis of the area considered in terms of building vulnerability, but new data sources such as remote sensing and national census information, when coupled with data mining algorithms, promise cost-effective and relevant ways to determine seismic damage to buildings.

Vulnerability of buildings is an essential element, since damage appears to be the critical hazard-related parameter. Riedel et al. (2014, 2015) and Guettiche et al. (2017) have shown that large-scale methods for vulnerability assessment can provide relevant forecasts of damage (D4 + D5) for a given seismic intensity. These methods are based on information that describes the general characteristics of the regional typology of structures and their particular attributes, such as those that might be obtained from satellite imagery. In this way, loss models can be improved everywhere by integrating the spatial variability of local vulnerability in the regions concerned. Further research is needed on additional predictor variables, such as population mobility during the day (exposure-related) and a physical characterization of the hazard based on ground-motion parameters.